# Analysing simulation results of event sequences with branches

I have a problem where a simulated sequence of events can follow one of several branching paths, for example

A1 > B1 > C1 > D1

or

A1 > B1 > C2 > D2

or

A1 > B1 > C2 > D3

or

A2 > B2 > C3 > D4

Note that there is more than one root starting point too (A1 and A2). Each stage also has some other properties attached to it. I'd like to ask questions such as:

- Find all stages (regardless of whether they are A, B, C or D) where property 1 = some value and, somewhere up the parent chain, property 2 = some value.
- Work out the probability of reaching each stage, given that all branches at any branching point are equally likely. So the probability of reaching D1 is 1/2 (A) * 1/2 (C), whereas for D2 or D3 it is 1/2 (A) * 1/2 (C) * 1/2 (D).
- Conditional probability: given that B1 has happened, what is the chance of reaching D3?
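To make the second and third questions concrete, here is a minimal Python sketch that merges the four example sequences into a prefix tree and multiplies `1 / (number of sibling branches)` along the path to a stage. It assumes stage names are unique and that each stage has exactly one parent. The virtual `ROOT` node and the function names are hypothetical, introduced only for illustration.

```python
from collections import defaultdict
from fractions import Fraction

# The four example sequences from the question.
sequences = [
    ["A1", "B1", "C1", "D1"],
    ["A1", "B1", "C2", "D2"],
    ["A1", "B1", "C2", "D3"],
    ["A2", "B2", "C3", "D4"],
]

ROOT = "<root>"  # hypothetical virtual node joining the separate starting points

# Merge the sequences into a prefix tree: parent -> set of children.
children = defaultdict(set)
parent = {}
for seq in sequences:
    prev = ROOT
    for stage in seq:
        children[prev].add(stage)
        parent[stage] = prev
        prev = stage


def reach_probability(stage):
    """Probability of reaching `stage` if every branch at a node is equally likely."""
    p = Fraction(1)
    while stage != ROOT:
        p *= Fraction(1, len(children[parent[stage]]))
        stage = parent[stage]
    return p


def is_ancestor(ancestor, stage):
    """True if `ancestor` lies strictly above `stage` on its parent chain."""
    while stage != ROOT:
        stage = parent[stage]
        if stage == ancestor:
            return True
    return False


def conditional_probability(stage, given):
    """P(stage | given); zero unless `given` is `stage` itself or one of its ancestors."""
    if stage == given:
        return Fraction(1)
    if not is_ancestor(given, stage):
        return Fraction(0)
    return reach_probability(stage) / reach_probability(given)
```

With the four sequences exactly as listed, `reach_probability("D1")` is 1/4, `reach_probability("D3")` is 1/8, and `conditional_probability("D3", "B1")` is 1/4. Using `Fraction` keeps the results exact. For millions of simulated sequences the same idea applies, but the tree would probably need to live in a database or a columnar store rather than in Python dictionaries.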

What's the best technique to store, query and analyse data like this? What keywords, fields or technologies should I search for and read up on?

Note that I'm planning to generate somewhere between hundreds of thousands and millions of sample event sequences.

I've had a look at recursive CTEs in an RDBMS. They solve problem 1, but problems 2 and 3 in combination seem a bit more difficult. Would a graph database like Neo4j solve the problem better?
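For context, a recursive CTE for problem 1 could look roughly like the sketch below, run here through Python's built-in `sqlite3` module. The `stage` table, its `prop1` / `prop2` columns and the sample property values are all hypothetical, since the question does not specify the real schema. It assumes each stage stores a single `parent_id`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical adjacency-list schema: one row per stage, pointing at its parent.
conn.execute(
    "CREATE TABLE stage (id TEXT PRIMARY KEY, parent_id TEXT, prop1 TEXT, prop2 TEXT)"
)
rows = [  # made-up property values, purely for illustration
    ("A1", None, "x", "red"),
    ("B1", "A1", "y", "blue"),
    ("C1", "B1", "x", "blue"),
    ("C2", "B1", "y", "blue"),
    ("D1", "C1", "x", "blue"),
    ("D2", "C2", "x", "blue"),
    ("D3", "C2", "y", "blue"),
    ("A2", None, "x", "blue"),
    ("B2", "A2", "x", "blue"),
    ("C3", "B2", "y", "blue"),
    ("D4", "C3", "x", "blue"),
]
conn.executemany("INSERT INTO stage VALUES (?, ?, ?, ?)", rows)

query = """
WITH RECURSIVE chain(id, ancestor_id) AS (
    -- Seed: every stage matching property 1, paired with its direct parent.
    SELECT id, parent_id
    FROM stage
    WHERE prop1 = :p1 AND parent_id IS NOT NULL
    UNION ALL
    -- Step: climb one level up the parent chain.
    SELECT chain.id, stage.parent_id
    FROM chain
    JOIN stage ON stage.id = chain.ancestor_id
    WHERE stage.parent_id IS NOT NULL
)
SELECT DISTINCT chain.id
FROM chain
JOIN stage AS anc ON anc.id = chain.ancestor_id
WHERE anc.prop2 = :p2
ORDER BY chain.id
"""

# Stages with prop1 = 'x' that have some ancestor with prop2 = 'red'.
matches = [row[0] for row in conn.execute(query, {"p1": "x", "p2": "red"})]
print(matches)
```

With the made-up rows above, only A1 has `prop2 = 'red'`, so the matches are the `prop1 = 'x'` stages below it: C1, D1 and D2.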