Analysing simulation results of event sequences with branches
I have a problem where a sequence of events can be any of the following:
A1 > B1 > C1 > D1
or
A1 > B1 > C2 > D2
or
A1 > B1 > C2 > D3
or
A2 > B2 > C3 > D4
Note that there is more than one root starting point. Each stage also has some other properties attached to it. I'd like to be able to do the following:
1. Find all stages (regardless of whether they are A, B, C or D) where property 1 equals some value and where some stage further up the parent chain has property 2 equal to some value.
2. Work out the probability of reaching each stage, given that all branches at a branching point are equally probable. So the probability of getting to D1 is 1/2 (A) * 1/2 (C), whereas for D2 or D3 it is 1/2 (A) * 1/2 (C) * 1/2 (D).
3. Conditional probability: given that B1 has happened, what is the chance of D3?
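To make the second and third questions concrete, here is a minimal in-memory sketch in Python. It is only an illustration of the calculation, not a suggestion for how to store millions of samples. It assumes that stage names are unique, that every stage has exactly one parent, and that the children of any stage (and the root stages among themselves) are equally likely. The names `build_tree`, `reach_probability` and `ROOT` are made up for this example.

```python
from collections import defaultdict
from fractions import Fraction

# The four example sequences from above; each is a root-to-leaf path.
SEQUENCES = [
    ["A1", "B1", "C1", "D1"],
    ["A1", "B1", "C2", "D2"],
    ["A1", "B1", "C2", "D3"],
    ["A2", "B2", "C3", "D4"],
]

ROOT = None  # virtual parent shared by all root stages (A1, A2)


def build_tree(sequences):
    """Return (children, parent) maps; assumes stage names are unique."""
    children = defaultdict(set)
    parent = {}
    for seq in sequences:
        prev = ROOT
        for stage in seq:
            children[prev].add(stage)
            parent[stage] = prev
            prev = stage
    return children, parent


def reach_probability(stage, children, parent, given=None):
    """P(stage), or P(stage | given) when `given` lies upstream of `stage`.

    Returns 0 if `given` is not on the path from the root to `stage`.
    Conditioning on a downstream stage is not handled by this sketch.
    """
    prob = Fraction(1)
    node = stage
    while node != given:
        if node is ROOT:
            # Walked past the root without meeting `given`.
            return Fraction(0)
        # Each sibling under the same parent is assumed equally likely.
        prob *= Fraction(1, len(children[parent[node]]))
        node = parent[node]
    return prob


children, parent = build_tree(SEQUENCES)
print(reach_probability("D1", children, parent))              # 1/4
print(reach_probability("D3", children, parent))              # 1/8
print(reach_probability("D3", children, parent, given="B1"))  # 1/4
```

The conditional case simply stops multiplying once the walk up the parent chain reaches the stage that is known to have happened.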
What is the best way or technique to store, query and analyse data like this? What keywords, fields or technologies should I search for and read up on?
Note that I'm planning to generate somewhere between hundreds of thousands and millions of sample event sequences.
I've had a look at recursive CTEs in an RDBMS. That solves problem 1, but problems 2 and 3 in combination seem a bit more difficult. I was wondering whether a graph database like Neo4j could solve the problem better.
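For illustration, a minimal sketch of how a recursive CTE can handle problem 1, run through Python's built-in `sqlite3` module. The `stage` table layout, the column names `prop1` and `prop2`, and all the property values are made up for this example; the real data may look quite different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stage (
        name   TEXT PRIMARY KEY,
        parent TEXT REFERENCES stage(name),  -- NULL for root stages
        prop1  TEXT,                         -- hypothetical property 1
        prop2  TEXT                          -- hypothetical property 2
    );
""")

# Adjacency list for the example sequences, with invented property values.
rows = [
    ("A1", None, "x", "red"),
    ("B1", "A1", "y", "blue"),
    ("C1", "B1", "x", "blue"),
    ("C2", "B1", "y", "blue"),
    ("D1", "C1", "y", "blue"),
    ("D2", "C2", "x", "blue"),
    ("D3", "C2", "y", "blue"),
    ("A2", None, "x", "blue"),
    ("B2", "A2", "x", "blue"),
    ("C3", "B2", "y", "blue"),
    ("D4", "C3", "x", "blue"),
]
conn.executemany("INSERT INTO stage VALUES (?, ?, ?, ?)", rows)

QUERY = """
WITH RECURSIVE chain(name, ancestor) AS (
    -- start from every stage matching property 1, pointing at its parent
    SELECT name, parent FROM stage WHERE prop1 = :p1
    UNION ALL
    -- then climb one level up the parent chain at a time
    SELECT chain.name, stage.parent
    FROM chain JOIN stage ON stage.name = chain.ancestor
)
SELECT DISTINCT chain.name
FROM chain JOIN stage ON stage.name = chain.ancestor
WHERE stage.prop2 = :p2
ORDER BY chain.name
"""

# Stages with prop1 = 'x' that have some ancestor with prop2 = 'red'.
matches = [r[0] for r in conn.execute(QUERY, {"p1": "x", "p2": "red"})]
print(matches)  # ['C1', 'D2']
```

Here only strict ancestors count, so A1 does not match itself even though it has both property values.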
