Analysing simulation results of event sequences with branches
So I have a problem where a sequence of events can branch, for example:
A1 > B1 > C1 > D1
A1 > B1 > C2 > D2
A1 > B1 > C2 > D3
A2 > B2 > C3 > D4
Note that there is more than one root starting point. Each stage also has other properties attached to it. The questions I want to ask are:
- Find all stages (regardless of A, B, C or D) where property 1 = some value and, somewhere up the parent chain, property 2 = some value.
- Work out the probability of getting to each stage, given that all "sequence branches" are equally probable. So the probability of getting to D3 is 1/2 (A) * 1/2 (C), whereas for the D1 or D2 stage it is 1/2 (A) * 1/2 (C) * 1/2 (D).
- Conditional probability: given that B1 has happened, what is the chance of D3?
What is the best way or technique to store, analyse and query data like this? What keywords, fields or technologies should I search for and read up on?
Note that I'm planning to generate anywhere from hundreds of thousands up to millions of sample event sequences.
I've had a look at recursive CTEs in an RDBMS. That solves problem 1, but 2 and 3 in combination seem more difficult. Could a graph database like Neo4j solve the problem better?
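To make problems 2 and 3 concrete, here is a minimal in-memory sketch rather than a recommendation for any particular storage technology. It assumes stage names are unique (each stage has exactly one parent) and that every child of a stage is equally likely. `ROOT`, `build_tree` and `reach_probability` are hypothetical names introduced only for this illustration.

```python
from fractions import Fraction

# The four example sequences listed above.
SEQUENCES = [
    ["A1", "B1", "C1", "D1"],
    ["A1", "B1", "C2", "D2"],
    ["A1", "B1", "C2", "D3"],
    ["A2", "B2", "C3", "D4"],
]
ROOT = "ROOT"  # hypothetical virtual root joining the separate starting points


def build_tree(sequences):
    """Return (children, parent) dicts for the prefix tree of the sequences."""
    children, parent = {ROOT: []}, {}
    for seq in sequences:
        prev = ROOT
        for stage in seq:
            if stage not in parent:  # first time this stage is seen
                parent[stage] = prev
                children[prev].append(stage)
                children[stage] = []
            prev = stage
    return children, parent


def reach_probability(stage, children, parent, given=ROOT):
    """P(reach stage | given was reached), with equal weight on every branch."""
    prob = Fraction(1)
    node = stage
    while node != given:
        if node == ROOT:
            return Fraction(0)  # 'given' is not an ancestor of 'stage'
        node_parent = parent[node]
        prob /= len(children[node_parent])  # one of N equally likely branches
        node = node_parent
    return prob


children, parent = build_tree(SEQUENCES)
for stage in ("D1", "D2", "D3", "D4"):
    print(stage, reach_probability(stage, children, parent))
print("D3 given B1:", reach_probability("D3", children, parent, given="B1"))
```

For the four sequences exactly as listed, where C2 rather than C1 is the stage that forks, this gives 1/4 for D1 and 1/8 for each of D2 and D3. The conditional query is the same walk up the parent chain, stopped early at the given stage.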
See also questions close to this topic
How does the numpy random normal function work?
How does np.random.normal(32000, 200000, 3650) work? As far as I know, the first value is the mean, the second is the standard deviation and the last one is the size. How can the deviation be 200000 when the mean is just 32000? And what is np.random.seed(12345), and why is it used?
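A small sketch that may help when experimenting with these two calls. The mean and the standard deviation are independent parameters of the normal distribution, so a standard deviation larger than the mean only means the samples are spread widely around 32000, including many negative values. Reseeding with the same value makes the generator reproduce the same draws. The variable names `first` and `second` are only illustrative.

```python
import numpy as np

np.random.seed(12345)  # fix the generator state so the draws are repeatable
first = np.random.normal(32000, 200000, 3650)

np.random.seed(12345)  # same seed again, so the same sequence of draws
second = np.random.normal(32000, 200000, 3650)

print(np.array_equal(first, second))  # True: both runs produced identical samples
print(first.shape)                    # (3650,): one sample per requested element
print(first.mean(), first.std())      # sample mean and spread, close to the parameters
print((first < 0).mean())             # fraction of samples below zero
```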
Python datatype that includes uncertainty/error bars?
Is there a Python datatype that includes numerical error bars? For example:
    a = 3.00 ± 0.100
    b = 4.00 ± 0.100
    b + a >> 7.00 ± 0.141
where √(0.1^2 + 0.1^2) = 0.141.
I figured that since complex numbers already exist in a similar form, e.g. a = 3 + 4j, maybe there is a module that handles error analysis for you as well. (I suppose it's complicated by the fact that the + and - uncertainties need not be equal.)
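To illustrate the behaviour being asked for, here is a minimal sketch of such a type. `Measured` is a hypothetical class written for this example, not an existing library type. It assumes the errors are independent and symmetric, so sums and differences combine errors in quadrature.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Measured:
    """Hypothetical value-with-error type; assumes independent, symmetric errors."""
    value: float
    error: float

    def __add__(self, other):
        # Errors of a sum combine in quadrature: sqrt(e1^2 + e2^2).
        return Measured(self.value + other.value,
                        math.hypot(self.error, other.error))

    def __sub__(self, other):
        # The same quadrature rule applies to a difference.
        return Measured(self.value - other.value,
                        math.hypot(self.error, other.error))

    def __str__(self):
        return f"{self.value:.2f} ± {self.error:.3f}"


a = Measured(3.00, 0.100)
b = Measured(4.00, 0.100)
print(b + a)  # 7.00 ± 0.141
```

Multiplication, division and asymmetric + and - uncertainties would need their own propagation rules and are left out of this sketch.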
How to get a stats value after CrawlerProcess has finished, i.e. on the line after process.start()
I am using this code somewhere inside the spider:
So when this exception is raised, my spider eventually closes and I get the stats in the console, including this string:
But how can I get it from code? I want to run the spider again in a loop, based on the info from these stats, something like this:
    from scrapy.crawler import CrawlerProcess
    from scrapy.utils.project import get_project_settings
    import spaida.spiders.spaida_spider
    import spaida.settings

    you_need_to_rerun = True
    while you_need_to_rerun:
        process = CrawlerProcess(get_project_settings())
        process.crawl(spaida.spiders.spaida_spider.SpaidaSpiderSpider)
        process.start(stop_after_crawl=False)  # the script will block here until the crawling is finished
        finish_reason = 'and here I get somehow finish_reason from stats'  # <- how??
        if finish_reason == 'finished':
            print("everything ok, I don't need to rerun this")
            you_need_to_rerun = False
I found this in the docs but can't get it to work. Where is the thing described as "The stats can be accessed through the spider_stats attribute, which is a dict keyed by spider domain name."? https://doc.scrapy.org/en/latest/topics/stats.html#scrapy.statscollectors.MemoryStatsCollector.spider_stats
P.S.: I'm also getting the error twisted.internet.error.ReactorNotRestartable when using process.start(), and the recommendation is to use process.start(stop_after_crawl=False), but then the spider just stops and does nothing. That is another problem, though...
Bouncing ball python with turtle graphics
So I have been working on my code, where I would like to simulate particles of different gases interacting. This is as far as I got: I create the balls and register bounces off the walls and off other balls. How would I now go about giving each "color" or type of gas its own properties, so that particles could, for example, stick together as a binary molecule?
    import turtle
    import random
    import time
    from Classes import *

    # Sets up the screen and everything
    wn = turtle.Screen()
    wn.tracer(0)
    wn.bgcolor("white")
    wn.canvheight = 600
    wn.canvwidth = 600
    wn.title("Nogravity")
    xlimit, ylimit = wn.canvwidth / 2, wn.canvheight / 2

    #----------------------------------------------------------------------
    # For drawing the bounds of the square
    board = turtle.Turtle()
    board.speed(0)
    rectangle(board, -wn.canvwidth/2, -wn.canvheight/2, wn.canvwidth, 5, "black")
    rectangle(board, -wn.canvwidth/2, -wn.canvheight/2, 5, wn.canvheight, "black")
    rectangle(board, -wn.canvwidth/2, wn.canvheight/2, wn.canvwidth, 5, "black")
    rectangle(board, wn.canvwidth/2 - 5, -wn.canvheight/2, 5, wn.canvheight, "black")
    board.ht()

    #----------------------------------------------------------------------
    # Ball setup
    balls = []
    colors = ["blue", "red", "lightblue", "black"]
    numberBalls = 10
    for _ in range(numberBalls):
        turt = turtle.Turtle()
        balls.append(turt)

    for ball in balls:
        ball.penup()
        ball.shape("circle")
        ball.color(random.choice(colors))
        ball.speed(0)
        # Integer division keeps the randint() bounds as ints
        x = random.randint(-wn.canvwidth // 2 + 10, wn.canvwidth // 2 - 10)
        y = random.randint(-wn.canvheight // 2 + 10, wn.canvheight // 2 - 10)
        ball.goto(x, y)
        ball.dy = random.randint(-2, 2)
        if ball.dy == 0:
            ball.dy += 1
        ball.dx = random.randint(-2, 2)
        if ball.dx == 0:
            ball.dx += 1

    ballimit = 20
    ballcount = 0

    #----------------------------------------------------------------------
    # Main game loop
    while True:
        wn.update()
        for ball in balls:
            # Move the ball: dx is the change along the x axis, dy the change along the y axis
            ball.sety(ball.ycor() + ball.dy)
            ball.setx(ball.xcor() + ball.dx)

            # Compute the extent of the ball
            ballXrangeHI = ball.xcor() + ballimit
            ballXrangeLO = ball.xcor() - ballimit
            ballYrangeHI = ball.ycor() + ballimit
            ballYrangeLO = ball.ycor() - ballimit

            if ballXrangeHI > xlimit or ballXrangeLO > xlimit:
                ball.dx *= -1
            if ballXrangeHI < -xlimit or ballXrangeLO < -xlimit:
                ball.dx *= -1
            if ballYrangeHI > ylimit or ballYrangeLO > ylimit:
                ball.dy *= -1
            if ballYrangeHI < -ylimit or ballYrangeLO < -ylimit:
                ball.dy *= -1

            # Bounce test: swap velocities with any other ball that is close enough
            for thing in balls:
                if balls[ballcount] != thing and proximity(ball.xcor(), ball.ycor(), thing.xcor(), thing.ycor()) <= ballimit:
                    tempx = ball.dx
                    tempy = ball.dy
                    ball.dx = thing.dx
                    ball.dy = thing.dy
                    thing.dx = tempx
                    thing.dy = tempy

            ballcount += 1
            if ballcount == numberBalls:
                ballcount = 0
        #time.sleep(0.0002)

    wn.mainloop()
Generating CDF Graphs using Seaborn
I am trying to plot a CDF graph for my code using Seaborn but can't get it to work.
Specifically, I want to generate CDF graphs for sum_MDA, sum_CLA, sum_BIA and grand_total after I have simulated the entire code 1000 times. My code is as follows (apologies in advance for the length).
    def sim():
        df['RAND'] = np.random.uniform(0, 1, size=df.index.size)
        dfRAND = list(df['RAND'])

        def L():
            result = []
            conditions = [df.RAND >= (1 - 0.8062),
                          (df.RAND < (1 - 0.8062)) & (df.RAND >= 0.1),
                          (df.RAND < 0.1) & (df.RAND >= 0.05),
                          (df.RAND < 0.05) & (df.RAND >= 0.025),
                          (df.RAND < 0.025) & (df.RAND >= 0.0125),
                          (df.RAND < 0.0125)]
            choices = ['L0', 'L1', 'L2', 'L3', 'L4', 'L5']
            df['L'] = np.select(conditions, choices)
            result = df['L'].values
            return result
        L()
        #print(L())
        #print(df.pivot_table(index='L', aggfunc=len, fill_value=0))

        def MD():
            result = []
            conditions = [L() == 'L0', L() == 'L1', L() == 'L2',
                          L() == 'L3', L() == 'L4', L() == 'L5']
            choices = [(df['P_MD'].apply(lambda x: x * 0.02)),
                       (df['P_MD'].apply(lambda x: x * 0.15)),
                       (df['P_MD'].apply(lambda x: x * 0.20)),
                       (df['P_MD'].apply(lambda x: x * 0.50)),
                       (df['P_MD'].apply(lambda x: x * 1.0)),
                       (df['P_MD'].apply(lambda x: x * 1.0))]
            df['MDL'] = np.select(conditions, choices)
            #result = print(df['MDL'].values)
            return result
        MD()

        def CL():
            result = []
            conditions = [L() == 'L0', L() == 'L1', L() == 'L2',
                          L() == 'L3', L() == 'L4', L() == 'L5']
            choices = [1600, 3200, 9600, 48000, 48000, 48000]
            df['CL'] = np.select(conditions, choices)
            #result = print(df['CL'].values)
            return result
        CL()

        def BI():
            result = []
            conditions = [L() == 'L0', L() == 'L1', L() == 'L2',
                          L() == 'L3', L() == 'L4', L() == 'L5']
            choices = [(df['P_BI'].apply(lambda x: (x / 548) * 1)),
                       (df['P_BI'].apply(lambda x: (x / 548) * 2)),
                       (df['P_BI'].apply(lambda x: (x / 548) * 14)),
                       (df['P_BI'].apply(lambda x: (x / 548) * 60)),
                       (df['P_BI'].apply(lambda x: (x / 548) * 180)),
                       (df['P_BI'].apply(lambda x: (x / 548) * 365))]
            df['BIL'] = np.select(conditions, choices)
            #result = print(df['BIL'].values)
            return result
        BI()

        sum_MDA = int(np.sum(df['MDL']))
        sum_CLA = int(np.sum(df['CL']))
        sum_BIA = int(np.sum(df['BIL']))
        grand_total = int(sum_MDA + sum_CLA + sum_BIA)
        result = sum_MDA, sum_CLA, sum_BIA, grand_total
        return result

    sim()

    for i in range(1000):
        print(sim())

    #sns.distplot(sim(), bins=100,
    #kde_kws=dict(cumulative=True), axlabel='(£)', color='purple',
    #).set_title('Simulation (N=1000)')
Any help is appreciated. Thanks a lot.
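For reference, a minimal sketch of what an empirical CDF over the 1000 runs amounts to, independent of any plotting library. `ecdf` is a hypothetical helper, and `totals` is stand-in random data rather than output of the real sim(); the assumption is that the results of the runs have first been collected into a list or array, one value per run.

```python
import numpy as np


def ecdf(samples):
    """Return the sorted values and the fraction of samples <= each value."""
    xs = np.sort(np.asarray(samples, dtype=float))
    ps = np.arange(1, xs.size + 1) / xs.size
    return xs, ps


# Stand-in data: 1000 hypothetical grand totals, not output of the real sim().
rng = np.random.default_rng(0)
totals = rng.normal(loc=1_000_000, scale=50_000, size=1000)

xs, ps = ecdf(totals)
print(xs[0], ps[0])    # smallest total and its cumulative fraction (1/1000)
print(xs[-1], ps[-1])  # largest total and cumulative fraction 1.0
```

With real data, `totals` could be built as `[sim()[3] for _ in range(1000)]`, and likewise for indices 0 to 2 for the other three sums. The pair `xs, ps` can then be drawn as a step plot, for example with matplotlib's `plt.step(xs, ps, where="post")`.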
Any online way to learn writing a device driver for a microcontroller in C from scratch?
Is there any way to learn online, without getting hold of an actual microcontroller, how to write a device driver (e.g. Bluetooth, USB) in C from scratch and see how the hardware works? I know the basics of C and I work in a Windows environment.
Implementation of PROMETHEE in Java
I am currently looking to implement PROMETHEE for website ranking in Java. I want to supply several website factors for multiple websites. The MCDA method is expected to compute the credibility score for each website and return an ordered list with each website's credibility score. I am currently stuck at the implementation of PROMETHEE in Java. I have tried to search for a Java library but I am hitting dead ends. I would also appreciate suggestions for alternative libraries.
token is either red, green or blue
Suppose we have a lot of tokens, and every token is either red, green or blue. We also have a bag with some tokens in it.
Let’s consider the following procedure:
Repeat the following until the bag is empty:
(1) If there are more than two tokens in the bag, take two random tokens out of the bag. Otherwise, empty the bag.
(2) According to the two tokens we got in step (1), do the following:
• Case 1: If one of the tokens is red, do nothing.
• Case 2: If both tokens are green, put one green token and 2 blue tokens back into the bag.
• Case 3: If we got one blue token and the other token is not red, put 3 red tokens back into the bag.
Assume that we always have enough tokens to put back into the bag. Prove via induction that this process always terminates.
For problems that require you to provide an algorithm, you must give the following (unless the problems explicitly mention that you don’t need to):
a precise description of the algorithm in English and, if helpful, pseudocode,
a proof of correctness,
an analysis of running time and space.
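The procedure above can be sketched as a small simulation. This is not a proof, only a way to watch the process run. The weights in `WEIGHT` (red 1, blue 2, green 5) are an assumption chosen for this sketch: with them the total weight of the bag drops in every round, which is the kind of decreasing measure an induction argument could be built on. `run_process` and `weight` are hypothetical names.

```python
import random

WEIGHT = {"R": 1, "B": 2, "G": 5}  # assumed weights, chosen for this sketch


def weight(bag):
    """Total weight of all tokens in the bag."""
    return sum(WEIGHT[token] for token in bag)


def run_process(bag, rng):
    """Run the procedure on a list of 'R'/'G'/'B' tokens; return the round count."""
    bag = list(bag)
    rounds = 0
    while bag:
        rounds += 1
        before = weight(bag)
        if len(bag) > 2:
            rng.shuffle(bag)
            first, second = bag.pop(), bag.pop()  # two random tokens
            if "R" in (first, second):
                pass                              # Case 1: do nothing
            elif first == second == "G":
                bag += ["G", "B", "B"]            # Case 2: one green, two blue
            else:
                bag += ["R", "R", "R"]            # Case 3: a blue, other not red
        else:
            bag.clear()                           # "Otherwise, empty the bag."
        assert weight(bag) < before               # the measure strictly decreases
    return rounds


start = ["G"] * 5 + ["B"] * 4 + ["R"] * 3
print(run_process(start, random.Random(1)), "rounds; initial weight", weight(start))
```

Because the weight is a non-negative integer that decreases by at least 1 per round, the number of rounds in any run is bounded by the initial weight.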
Remove stopwords in a CSV file with regex in Python
I have this code in Python 2:
    import re
    ...
    def processTweet(tweet):
        ...
        tweet = re.sub('@[^\s]+', ' ', tweet)
        tweet = tweet.lower()
        ...
        return tweet

    data = pd.read_csv('data/dataset1.csv', quotechar='|', encoding='latin-1')
    text = data.iloc[:, 1]
    label = data.iloc[:, 0]
    labelz = Series.to_frame(label)
    text = text.apply(processTweet)
    textz = Series.to_frame(text)
    dataz = textz.join(labelz)
    print(dataz)
    dataz.to_csv("data/DataPreprocessing.csv", sep=',', encoding='utf8', index=False, header=False)
The code above works fine, but how do I add stopword removal like in this code:
    # Import Stopword Factory class
    from Sastrawi.StopWordRemover.StopWordRemoverFactory import StopWordRemoverFactory

    s = 'String text sample...'
    stop_factory = StopWordRemoverFactory()
    more_stopword = ['dengan', 'ia', 'bahwa', 'oleh']
    stopword = stop_factory.create_stop_word_remover()
    print(stopword.remove(s))
The code above is for the string 's'. How can I use pandas or something similar to process the CSV file? Can it be done with regex, like in the first code?
Thank you, and sorry, I'm new to Python.
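A minimal regex-only sketch of the idea, using just the standard library. `STOPWORDS` here is a hypothetical list taken from the `more_stopword` example above, and `remove_stopwords` is a name made up for this illustration; it does not use or reproduce Sastrawi's own stopword list.

```python
import re

# Hypothetical stopword list; in practice it could come from Sastrawi or a file.
STOPWORDS = ["dengan", "ia", "bahwa", "oleh"]

# One alternation that matches any stopword as a whole word only.
STOPWORD_RE = re.compile(r"\b(?:" + "|".join(map(re.escape, STOPWORDS)) + r")\b")


def remove_stopwords(text):
    """Delete whole-word stopwords, then collapse the leftover whitespace."""
    without = STOPWORD_RE.sub(" ", text.lower())
    return re.sub(r"\s+", " ", without).strip()


print(remove_stopwords("Ia berkata bahwa buku itu ditulis oleh temannya"))
# -> berkata buku itu ditulis temannya
```

Since the function takes and returns a plain string, it could be applied to the text column in the same way as processTweet, i.e. with `text = text.apply(remove_stopwords)`.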