Multistep data pipeline with Kafka?

I'm evaluating Kafka as a potential tool for performing ETL jobs at my company. We have an existing workflow that goes something like this:

  1. Pull all CSVs in a specific AWS S3 directory, parse those files line by line, and insert each row into a database.

  2. Once the entire directory has been processed, another job pulls a distinct list of IDs from that database and begins a machine learning analysis step for each ID. (A rough sketch of this batch flow follows the list.)
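
For context, here's a minimal sketch of what the current batch flow looks like. It's a stand-in rather than our real code: the bucket, prefix, table, and run_analysis are placeholders, and I'm using sqlite3 in place of the actual database.

```python
import csv
import io
import sqlite3

import boto3

BUCKET = "my-bucket"        # placeholder bucket
PREFIX = "incoming/csv/"    # placeholder for the S3 "directory"


def run_analysis(device_id):
    """Placeholder for the per-ID machine learning step."""
    print(f"analysing {device_id}")


s3 = boto3.client("s3")
db = sqlite3.connect("device.db")  # stand-in for the real device DB
db.execute("CREATE TABLE IF NOT EXISTS readings (device_id TEXT, value TEXT)")

# Step 1: pull every CSV under the prefix and load each row into the DB.
for obj in s3.list_objects_v2(Bucket=BUCKET, Prefix=PREFIX).get("Contents", []):
    text = s3.get_object(Bucket=BUCKET, Key=obj["Key"])["Body"].read().decode("utf-8")
    for row in csv.reader(io.StringIO(text)):
        # Assumes the first two columns are the device ID and a value.
        db.execute("INSERT INTO readings (device_id, value) VALUES (?, ?)", row[:2])
db.commit()

# Step 2: once the whole directory is loaded, analyse each distinct ID.
for (device_id,) in db.execute("SELECT DISTINCT device_id FROM readings"):
    run_analysis(device_id)
```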

If I were to replace this current process, would Kafka be capable of the above, and if so, how should I set it up? From what I have read, my first thought was:

To replace the first step, create a Kafka producer that reads every file in the S3 directory and sends each line to a stream (would I call this the S3 topic?). I would then create a Kafka consumer that would take each record from this stream and insert it into the database (let's call it the device DB).
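
To make that concrete, here's roughly what I picture for this first producer/consumer pair, using kafka-python and boto3. Everything here is a placeholder sketch: the topic name "s3-lines" stands for the S3 topic, the bucket/prefix aren't real, and I'm using sqlite3 in place of the device DB. Each message is just a raw CSV line, since I haven't settled on a message format.

```python
# Producer: read every file under the S3 prefix and publish one message per line.
import boto3
from kafka import KafkaProducer

BUCKET = "my-bucket"        # placeholder
PREFIX = "incoming/csv/"    # placeholder

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: v.encode("utf-8"),
)

s3 = boto3.client("s3")
for obj in s3.list_objects_v2(Bucket=BUCKET, Prefix=PREFIX).get("Contents", []):
    text = s3.get_object(Bucket=BUCKET, Key=obj["Key"])["Body"].read().decode("utf-8")
    for line in text.splitlines():
        producer.send("s3-lines", value=line)
producer.flush()
```

And the consumer that writes those lines into the device DB:

```python
# Consumer: take each record off the "s3-lines" topic and insert it into the DB.
import sqlite3
from kafka import KafkaConsumer

db = sqlite3.connect("device.db")  # stand-in for the real device DB
db.execute("CREATE TABLE IF NOT EXISTS readings (device_id TEXT, value TEXT)")

consumer = KafkaConsumer(
    "s3-lines",
    bootstrap_servers="localhost:9092",
    group_id="device-db-writer",
    auto_offset_reset="earliest",
    value_deserializer=lambda v: v.decode("utf-8"),
)

for message in consumer:
    # Assumes the first two CSV columns are the device ID and a value.
    device_id, value = message.value.split(",")[:2]
    db.execute("INSERT INTO readings (device_id, value) VALUES (?, ?)", (device_id, value))
    db.commit()
```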

To replace the second, I would create another producer that would fire once the first producer/consumer pair had completed. This producer would gather a distinct set of IDs from the table and send them to a second stream (the Analysis topic?). Then, as each record came into this stream, a second consumer would pull the ID and perform the needed analysis.
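
Again as a placeholder sketch, the second stage would look something like this, with "analysis-ids" standing in for the Analysis topic:

```python
# Producer: fires once the first stage is done, publishing each distinct ID.
import sqlite3
from kafka import KafkaProducer

db = sqlite3.connect("device.db")  # stand-in for the real device DB
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: v.encode("utf-8"),
)

for (device_id,) in db.execute("SELECT DISTINCT device_id FROM readings"):
    producer.send("analysis-ids", value=device_id)
producer.flush()
```

```python
# Consumer: run the analysis step for each ID that arrives on the topic.
from kafka import KafkaConsumer


def run_analysis(device_id):
    """Placeholder for the machine learning analysis."""
    print(f"analysing {device_id}")


consumer = KafkaConsumer(
    "analysis-ids",
    bootstrap_servers="localhost:9092",
    group_id="analysis-workers",
    auto_offset_reset="earliest",
    value_deserializer=lambda v: v.decode("utf-8"),
)

for message in consumer:
    run_analysis(message.value)
```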

That was my first thought on how to tackle the problem, but I still have some questions that I'm hoping someone with more experience can answer:

  1. Does the Kafka setup I described seem like the proper way to use the tool, or are there improvements I should consider?

  2. Is inserting records into the device DB that I mentioned even necessary, or would it be possible to store all of those records in a stream? This would essentially merge the two steps and eliminate the consumer from step 1 and the producer from step 2 (see the sketch below for roughly what I mean).
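
To make question 2 concrete, the merged version I'm imagining is a single process that consumes the raw lines, dedupes the IDs itself, and forwards them straight to the Analysis topic, with no device DB in the middle (same placeholder topic names as above):

```python
# Merged step: consume raw lines, extract and dedupe IDs, publish straight to analysis.
from kafka import KafkaConsumer, KafkaProducer

consumer = KafkaConsumer(
    "s3-lines",
    bootstrap_servers="localhost:9092",
    group_id="id-extractor",
    auto_offset_reset="earliest",
    value_deserializer=lambda v: v.decode("utf-8"),
)
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: v.encode("utf-8"),
)

seen_ids = set()  # in-memory dedupe only; this state is lost on restart

for message in consumer:
    device_id = message.value.split(",")[0]  # assumes the ID is the first CSV column
    if device_id not in seen_ids:
        seen_ids.add(device_id)
        producer.send("analysis-ids", value=device_id)
```

My hesitation with this is that the distinct-ID state only lives in memory here, and I no longer have an obvious "entire directory has been processed" point to trigger anything on, so I'm not sure whether dropping the DB is actually a good idea.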