EventFlow
Import a large number of events
Hi team,
My team is considering using EventFlow for our project.
- The problem is that we need to import a large number of events into the SQL Server event store, so we're writing a console application that loads data from existing tables and sends commands to build the events.
- We regularly load a lot of data into the database (about 2-3 million records from a CSV file) once a week.
Our database has a lot of records, about 30 million across all tables, and it will grow in the coming years.
I think the best way is to disable the `ReadModels` feature; we just need to insert events into the event store and replay them. But if we insert them one by one, it will be very slow.
I'm also thinking of using a table-valued parameter to insert events into the event store table in batches; it's the fastest way that I know of.
- Can anyone show me how to implement it (with a table-valued parameter), or offer a suggestion?
- Is there an option to disable the `ReadModel` feature?
- Should I use one table per aggregate root to reduce the event version on each aggregate root?
Thank you!
I'm not at a computer right now, but for the table-valued parameter you can use the one built into EventFlow and look at its implementation in the EventFlow code base.
I'll see if I have any input to the rest when I'm at a computer.
@minhhungit Got computer access
Just want to give you a few ideas; whether or not any of them are good in your situation depends on your application and your preferences.
- You could make your own implementation of `MsSqlEventPersistence` and use that on your existing event store; then you wouldn't need to migrate anything. You would need to implement the `EventJsonSerializer` as well. I don't know if it's easy, but it might be something to consider, as it would allow you to go back.
- The table value type in EventFlow is called `TableParameter<>` and can be used directly via `IMsSqlConnection.InsertMultipleAsync<long, EventDataModel>(...)`. The SQL for the type is here: https://github.com/eventflow/EventFlow/blob/develop/Source/EventFlow.MsSql/EventStores/Scripts/0002%20-%20Create%20eventdatamodel_list_type.sql and the wrapper that handles the SQL magic is called `TableParameter<>` in the EventFlow code base. Your console application could use that. I'm not sure about the performance though; I haven't tested it.
- Disabling the read model feature is rather simple, just don't use it 😄 It's opt-in. If you want to disable it further though, you could create an empty implementation of `IDomainEventPublisher` and register that in the IoC container.
- I'm not sure what you mean by your last question regarding the event version for each aggregate root. There are two concepts in the MSSQL EventFlow implementation: the global sequence number and the aggregate sequence number. The global one isn't used by the aggregate and is merely used for replaying all events. The aggregate sequence number is an `int`, so that should make it possible to have 2,147,483,647 events on _each_ aggregate. The global one is a `long`.
- If you have aggregates with many events (500+), consider using snapshots (http://docs.geteventflow.net/Snapshots.html) to improve performance.
- Check whether the event upgrade feature might be useful for deprecating old events; remember to keep the old definitions around (http://docs.geteventflow.net/EventUpgrade.html).
Remember that you can create custom implementations of everything in EventFlow. So if you have specific needs, use the EventFlow implementation as a template and then create your own. Maybe even use some of the test suites found in the EventFlow code base.
If you discover any performance bottlenecks, I would very much like to hear about it.
Hope that helps
P.S.: If you can tell me how the migration went, what you did, what worked and what didn't, especially if there are areas where EventFlow was cumbersome to work with, I would be happy 😄
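For readers unfamiliar with the snapshot idea mentioned in the list above, here is a minimal, language-agnostic sketch in Python. It is not EventFlow code: `Snapshot`, `apply_event`, and `load` are hypothetical names, and an "event" is just a number added to a running total. It only illustrates why a snapshot cuts the number of events that must be replayed when loading an aggregate.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Snapshot:
    version: int  # aggregate sequence number at which the snapshot was taken
    state: int    # aggregate state at that version (a running total here)


def apply_event(state: int, event: int) -> int:
    """Apply one event to the state; an event is just an amount to add."""
    return state + event


def load(events: List[int], snapshot: Optional[Snapshot] = None) -> Tuple[int, int]:
    """Rebuild the state, replaying only events newer than the snapshot.

    `events` is ordered by aggregate sequence number (1-based), so the
    event with sequence number n sits at index n - 1.
    Returns (state, number_of_events_replayed).
    """
    if snapshot is None:
        state, start = 0, 0
    else:
        state, start = snapshot.state, snapshot.version
    tail = events[start:]
    for event in tail:
        state = apply_event(state, event)
    return state, len(tail)


# Hypothetical aggregate with 1200 events and a snapshot taken at version 1000.
events = [1] * 1200
full_state, full_replayed = load(events)
snap = Snapshot(version=1000, state=sum(events[:1000]))
snap_state, snap_replayed = load(events, snap)
```

Both loads end in the same state, but the second one replays only the events after the snapshot version.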
Thank you for your help, @rasmus. Your ideas are not simple, so I think I will need time to research them. Thank you again!
@rasmus This framework is hard for me. Can you show me how to send a command with the table value type?
@minhhungit Sure, it's right here: https://github.com/eventflow/EventFlow/blob/develop/Source/EventFlow.MsSql/EventStores/MsSqlEventPersistence.cs#L139
If there's something specific that's cumbersome, let me know
No @rasmus, I mean how to create an EventFlow command that can emit a list of events.
In that emit method, it would use the table value type to optimize speed.
Actually, I would be very happy if you could implement a new library to import events, something like `Eventflow.EventImporter`. I am not the right person to implement it :flushed:
There's already such a method on `IEventStore` that you can use: https://github.com/eventflow/EventFlow/blob/develop/Source/EventFlow/EventStores/IEventStore.cs#L35
There seems to be some misunderstanding. As I see it, `StoreAsync` is a method used to store events for a single aggregate ID.
In my case, I have a normal table called Supplier with 1 million records.
I need to convert each record into a Supplier aggregate root with a `SupplierCreated` event.
Each supplier record will have its own SupplierRoot ID, so I need to create 1 million Supplier aggregate roots, each with a single `SupplierCreated` event, not one aggregate with 1 million events. That means we will have 1 million supplier aggregate roots with AggregateSequenceNumber = 1, like this:
GlobalSeqNum | AggregateId | AggregateName | Data | AggSeqNum |
---|---|---|---|---|
1 | Supplier-id-1 | SupplierRoot | {Name : "hello"} | 1 |
2 | Supplier-id-2 | SupplierRoot | {Name : "world"} | 1 |
3 | Supplier-id-3 | SupplierRoot | {Name : "and"} | 1 |
4 | Supplier-id-4 | SupplierRoot | {Name : "Eventflow"} | 1 |
The goal is to use a table-valued parameter to build one list of all the records above and insert them into the EventFlow table in one go!
Anyway, an aggregate with 1 million events is a tougher requirement; right now I don't need it, but I will.
Sorry for my English!
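The import described above can be sketched roughly as follows, in Python for brevity. This is only an illustration of the shape of the job, not EventFlow's API: `to_event_row`, `batches`, and `bulk_insert` are hypothetical names, the `Supplier-id-N` format is taken from the table above, and the global sequence number is assumed to be assigned by the database. In a real importer, `bulk_insert` would be one table-valued-parameter insert per batch.

```python
import json
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def to_event_row(record: Dict) -> Dict:
    """Map one legacy Supplier record to one event row (one aggregate, one event)."""
    return {
        "AggregateId": f"Supplier-id-{record['id']}",
        "AggregateName": "SupplierRoot",
        "Data": json.dumps({"Name": record["name"]}),
        # Every aggregate is new, so its first event has sequence number 1.
        "AggregateSequenceNumber": 1,
    }


def batches(rows: Iterable[Dict], size: int) -> Iterator[List[Dict]]:
    """Yield lists of at most `size` rows without materialising the whole input."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


# Stand-in for the database call; it only records how many rows each call received.
inserted_sizes: List[int] = []


def bulk_insert(chunk: List[Dict]) -> None:
    inserted_sizes.append(len(chunk))


# Hypothetical source data: 2500 legacy supplier records, read lazily.
records = ({"id": i, "name": f"supplier {i}"} for i in range(1, 2501))
for chunk in batches(map(to_event_row, records), 1000):
    bulk_insert(chunk)

first_row = to_event_row({"id": 1, "name": "hello"})
```

Batching keeps each round trip large enough to be fast while avoiding a single list of a million rows in memory; the batch size of 1000 is arbitrary and would need measuring.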
@minhhungit Then the easiest solution would be to mimic the code in the MSSQL event store found on line https://github.com/eventflow/EventFlow/blob/develop/Source/EventFlow.MsSql/EventStores/MsSqlEventPersistence.cs#L108
My focus right now is getting .NET Core support in place and cleaning up some of the read model population code.
Hi, I want to use EventFlow in a production environment. I have 100000 events for one aggregate, but it's too slow. Do you know a way to improve performance? I have implemented snapshots, but it is still too slow. Is it better to use EventStore?
@poumup I have been thinking about what to do, and currently there isn't an easy solution. If you haven't already imported all events to EventFlow, you might want to create a custom mapper in a custom application that takes your existing events and maps them to the EventFlow JSON metadata format. The format should be fairly straightforward; then use MSSQL bulk inserts to insert several hundred events at a time.
If you don't want to load 100.000 events into memory and apply them to aggregates, then you'll have to create the snapshots manually 😢
I have been working on streaming events from event stores into aggregates, i.e., loading events in batches and applying them to aggregates to reduce memory consumption for aggregates with a lot of events, but I was hoping to wait until C# 8 `IAsyncEnumerable` came out, as that would be a perfect fit. I had started implementing an `ObjectStream<T>` for the event archive in PR #317, but again, using a standard type would be best.
If you have any alternative ideas, please post them.
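As a rough illustration of the batched loading described above, here is a small Python sketch. It is not EventFlow code: `read_batches` and `fetch_page` are hypothetical, the event store is a plain list, and a synchronous generator stands in for an async stream. The point is only that the aggregate sees every event in order while at most one page is held in memory at a time.

```python
from typing import Callable, Iterator, List

# Hypothetical event store for one aggregate: 100000 events, where the
# event with aggregate sequence number n has payload n.
store: List[int] = list(range(1, 100_001))


def fetch_page(after: int, limit: int) -> List[int]:
    """Return up to `limit` events with sequence number greater than `after`."""
    return store[after:after + limit]


def read_batches(fetch: Callable[[int, int], List[int]], batch_size: int) -> Iterator[List[int]]:
    """Yield the aggregate's events page by page, in sequence order."""
    after = 0
    while True:
        page = fetch(after, batch_size)
        if not page:
            return
        yield page
        after += len(page)


# "Apply" each event to a trivial aggregate state (a running sum) and
# track how large the biggest in-memory page was.
state = 0
pages = 0
max_page = 0
for page in read_batches(fetch_page, 1000):
    pages += 1
    max_page = max(max_page, len(page))
    for event in page:
        state += event
```

The page size trades round trips against memory; 1000 here is an arbitrary choice.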
@minhhungit Hi. I started a PoC for importing a huge set of events. It's not ready yet, but it might give you an idea. Have a look at #464 and the added `MsSqlEventStoreTests.ImportEvents` test.
@rasmus thank you for the attempt 💛 Honestly, I have not worked with EventFlow in a long time; I will review it ASAP.
Hello there!
We hope you are doing well. We noticed that this issue has not seen any activity in the past 90 days. We consider this issue to be stale and will be closing it within the next seven days.
If you still require assistance with this issue, please feel free to reopen it or create a new issue.
Thank you for your understanding and cooperation.
Best regards, EventFlow
Hello there!
This issue has been closed due to inactivity for seven days. If you believe this issue still needs attention, please feel free to open a new issue or comment on this one to request its reopening.
Thank you for your contribution to this repository.
Best regards, EventFlow