Do multiple replacements at once
Given the string
s = 'Money&Gram'
I want to replace the & sign with "-" and with " ", so that the output is
['Money-Gram', 'Money Gram']
I solved it with a list of tuples and a loop, but I think there must be a more elegant solution.
t = [('&', '-'), ('&', ' ')]
l = []
for old, new in t:
    # collect each variant instead of overwriting the previous one
    l.append(s.replace(old, new))
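Since the character being replaced is the same in both cases, one possible shortening is a list comprehension over the replacement characters only. This is a minimal sketch; the names `replacements` and `variants` are made up for illustration.

```python
s = 'Money&Gram'

# Only the replacement changes, so iterate over the replacements alone.
replacements = ('-', ' ')
variants = [s.replace('&', new) for new in replacements]

print(variants)
```

str.replace returns a new string each time, so the original s is left untouched and every variant starts from the same input.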
See also questions close to this topic
- Why is this an invalid syntax?
Compact Lists OR Faster lists?
Is there a way to create lists or lists of lists (and maybe dicts) that act like lists in Python but take less memory?
Even if access is slower for an in-memory structure.
Or the other way around: faster, but taking more memory.
Using in-memory DBs like Redis is, I suppose, slower and takes more memory!
One possible use is NLP and ML tasks, where we have to store big chunks of parsed text, or features.
One way for words is to create a lexer/dict and keep a list of integers, but that is still a Python list, and I suppose the meta-info overhead will be bigger percentage-wise.
In : sys.getsizeof(list(range(100)))
Out: 1008
In : sys.getsizeof(array('i',range(100)))
Out: 472
In : sys.getsizeof(list(range(1000)))
Out: 9112
In : sys.getsizeof(array('i',range(1000)))
Out: 4184
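The comparison above can be reproduced with the standard-library array module, which stores raw C integers and still supports indexing, slicing and append. The sketch below assumes a 64-bit CPython build; the exact byte counts vary between versions, so none are shown.

```python
import sys
from array import array

values = list(range(1000))
compact = array('i', values)  # typed storage: one C int per element

# getsizeof on a list counts only the pointer table, not the int objects
# it points to, so the real footprint of the list is larger still.
list_size = sys.getsizeof(values)
compact_size = sys.getsizeof(compact)

# The array still behaves much like a list.
compact.append(1000)
first_three = compact[:3].tolist()

print(list_size, compact_size, first_three)
```

The trade-off is that an array holds a single numeric type, so a word list would first have to be mapped to integer ids, as the question suggests.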
How can I decode the RSA encryption more efficiently?
For a project I'm decoding RSA encryption. My code works perfectly, but the check I can run says it's too slow.
I've tested the algorithm and I've concluded that the bottleneck is in the following code:
message = (c**d) % n
Without this line, the code runs instantaneously. c is the encrypted message, d is the modular multiplicative inverse, and n = pq. The encrypted message is 783103, so I get that I'm dealing with large numbers, but it now takes around 1 second to run. Is there any way to speed this up?
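A likely cause is that (c**d) builds the full power, a huge integer, before the modulo is taken. Python's built-in pow accepts a third argument and performs modular exponentiation, keeping every intermediate value below n. The sketch below uses a tiny textbook key chosen purely for illustration; it is not the key from the question.

```python
# Toy RSA parameters (illustrative assumption, far too small for real use).
p, q = 61, 53
n = p * q          # 3233
e = 17             # public exponent
d = 2753           # private exponent: (e * d) % ((p - 1) * (q - 1)) == 1

plain = 65
c = pow(plain, e, n)      # encrypt

# Three-argument pow reduces modulo n at every step instead of
# computing c**d in full first.
message = pow(c, d, n)    # decrypt

print(message)
```

For the question's code, this would amount to replacing (c**d) % n with pow(c, d, n).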
Split string to make df out of parts of the string
I have the following string
text <- "Species\n9.1.1 Dog A2002 AKITA CHOW The Akita Chow is a mixed\n breed. Large/independent,\n strong and loyal\n A2003 AMERICAN BULLDOG (BULLDOG) The american Bulldog is\n stocky and musical, but also\n agile and built for chasing\n animals\n9.1.2.Flying (or gliding) B101 BIG EARED BAT Townsend’s big-eared bat\nanimals9.1.2.Flying (or (Corynorhinus townsendii) is a\ngliding) animals species of vesper bat.\n"
I wish to obtain a df like:
  Species                            Animal
1 9.1.1 Dog                          A2002 AKITA CHOW
2 9.1.1 Dog                          A2003 AMERICAN BULLDOG (BULLDOG)
3 9.1.2. Flying (or gliding) animals B101 BIG EARED BAT
The only thing that seems consistent/has no errors is the uppercase column (animal) for example A2002 AKITA CHOW, that's why I thought the most logical thing to do is to split everything before and after the uppercase part.
# search for something with space before it, and starting with capital letter followed by integers
strsplit(text, "(?<=\\s)(?=[A-Z][0-9]+)", perl = TRUE)
Anybody have suggestions? Thanks in advance :)
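The idea of anchoring on the uppercase part can be sketched as a regular expression that matches a code (capital letter plus digits) followed by all-uppercase words. The sketch is in Python rather than R, uses a shortened excerpt of the string above, and extracts only the Animal column; the pattern is an assumption based on this one sample.

```python
import re

# Shortened excerpt of the text from the question.
text = (
    "Species\n9.1.1 Dog A2002 AKITA CHOW The Akita Chow is a mixed\n"
    " breed. Large/independent,\n strong and loyal\n"
    " A2003 AMERICAN BULLDOG (BULLDOG) The american Bulldog is\n"
    " stocky and musical\n"
    "9.1.2.Flying (or gliding) B101 BIG EARED BAT Townsend's big-eared bat\n"
)

# Code such as A2002, then one or more words made of capitals/parentheses.
# The lookahead stops at words like "The" that continue in lowercase.
animal_pattern = re.compile(r"\b[A-Z]\d+(?: [A-Z()]+(?![a-z]))+")

animals = animal_pattern.findall(text)
print(animals)
```

Assigning each animal to its Species heading would need a second pass over the text and is not covered here.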
Problems while creating a boolean occurrence list?
Given a fixed list of terms, and a string:
A = ['the', 'quick brown', 'fox', 'dog']
B = 'the jump, lazy "dog" quick brown.'
How can I create a list with the boolean occurrences? For example, for the above items, the expected result should look like:
[1, 0, 0, 1, 1]
The reason is that, on the one hand, quick brown appears in list A. On the other hand, lazy doesn't appear in A. So far I have tried to do it by transforming the items to numpy arrays and using numpy's where:
list(np.where(np.asarray(A) == np.asarray(B.split()), 1, 0))
However, it doesn't seem to work, as I am getting:
ValueError                                Traceback (most recent call last)
----> 3 list(np.where(np.asarray(A) == np.asarray(B.split()), 1, 0))
ValueError: shape mismatch: objects cannot be broadcast to a single shape
Any idea how to get the occurrences in the list?
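The element-wise comparison fails because A has 4 items while B.split() yields a different number of tokens, and splitting on whitespace also breaks "quick brown" apart. One possible approach, sketched below, tokenizes B with a regular expression that tries the known terms first (longest first) and falls back to single words. The names `pattern`, `tokens` and `flags` are made up for illustration.

```python
import re

A = ['the', 'quick brown', 'fox', 'dog']
B = 'the jump, lazy "dog" quick brown.'

# Try multi-word terms before single words, then fall back to any word.
terms = sorted(A, key=len, reverse=True)
alternatives = [r'\b' + re.escape(term) + r'\b' for term in terms]
pattern = re.compile('|'.join(alternatives) + r'|\w+')

tokens = pattern.findall(B)

# 1 if the token is one of the fixed terms, else 0.
known = set(A)
flags = [1 if token in known else 0 for token in tokens]

print(tokens)
print(flags)
```

The flags follow the order in which the tokens appear in B, which matches the expected result in the question.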
How to identify incremental patterns in a string in Python
I have a one-column data frame that contains randomly generated characters. I am hoping to write some code that can identify if any of the characters are following an incremental pattern of some sort. Example:
ebe120xg21
ebe121xg22
vpq17laos
fvut10hals
ebe122xg23
Some of these numbers are clearly incrementing, e.g.
How would I efficiently identify this kind of incrementation? The tricky part is that these patterns can appear in any section of the string.
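One way to approach this, sketched below, is to replace every run of digits with a placeholder so that strings sharing the same non-numeric skeleton fall into one group, and then check whether the numbers in each position rise by a fixed step. It assumes the values contain no '#' character and that "incrementing" means a constant step of 1 after sorting; the function name is hypothetical.

```python
import re
from collections import defaultdict

def incrementing_templates(values, step=1):
    """Map each template to the numeric field positions that rise by `step`."""
    groups = defaultdict(list)
    for value in values:
        template = re.sub(r'\d+', '#', value)  # e.g. 'ebe#xg#'
        numbers = tuple(int(n) for n in re.findall(r'\d+', value))
        groups[template].append(numbers)

    result = {}
    for template, rows in groups.items():
        if len(rows) < 2:
            continue  # a single value cannot show a pattern
        rows.sort()
        fields = [
            i for i in range(len(rows[0]))
            if all(b[i] - a[i] == step for a, b in zip(rows, rows[1:]))
        ]
        if fields:
            result[template] = fields
    return result

data = ['ebe120xg21', 'ebe121xg22', 'vpq17laos', 'fvut10hals', 'ebe122xg23']
found = incrementing_templates(data)
print(found)
```

For a data frame column, the same function could be applied to the column converted to a plain list of strings.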
How to replace values in a specific part of character with a range of numbers
I can't seem to figure out how to replace a specific value of the characters in my vector. My vector is:
str(cryostick_1)
chr [1:21] "4490015" "44900151" "44900151" "44900151" "44900151" "44900152" "44900152" "44900152" ...
The 4th through 6th characters of each value in the vector are "001". I need to change these to 002 (through the end of the vector), then 003, 004, ... up to 137.
Is there any way to do this with a for loop or lapply? At the moment, when I even try to create a range from 002 to 137, it drops the leading zeros: 002 --> 2
Any help would be very much appreciated
Thank you in advance. This is how the output should look:
4490015 44900151 44900151 . . . 4490025 44900251 44900251 . . . 4490035
and so on until 137
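The leading zeros disappear because 002 is treated as a number; formatting the counter as a zero-padded string keeps them. The sketch below is in Python rather than R and uses a three-element stand-in for the vector, but R's sprintf with a "%03d" format plays the same role.

```python
# Stand-in for the first few values of cryostick_1 (illustrative only).
ids = ["4490015", "44900151", "44900152"]

def renumber(values, n):
    """Replace characters 4 to 6 of each value with n, zero-padded to width 3."""
    return [value[:3] + f"{n:03d}" + value[6:] for value in values]

# Repeat the whole vector once per number from 001 to 137.
all_ids = [new_id for n in range(1, 138) for new_id in renumber(ids, n)]

print(all_ids[:6])
```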
(regex,sed) How to append erased word after erasing word
What I want to change
What I want to make
I want to pick up all dots (".") and append them to the end of the line.
If there is no dot (".") in the line, then it should just print the line.
I can erase the dots in lines that have them, but I can't append a dot to the end of the line under this condition (
$ echo /usr/local:.:/bin | sed -r -e "/\.:/s/(\.:)//g"
How can I append erased word after I erase word, only with the
R sweep a dataframe for characters, but only in the parameter columns
If I have a .csv that looks a bit like this (names and places have been changed to protect the innocent) and is read in as a dataframe df
  Species Place param1 param2 param3
1 D.lice on head 123.123 39 65.43
2 X.elephant up butt 234.400 *****
3 B.booger in nose 32.000 <NA> $%(*0
4 F.farts blame dog -9.990 43
How would I remove all character "cells" and replace them with an empty value "" (not NULL), leaving only numbers (and, importantly, columns that have num (or numerical) type), so that I can stop errors like this
Error in hist.default(testParam) : 'x' must be numeric
where testParam is one of the columns?
I thought of sweep, and have been trying various implementations of replace, but I can't seem to get either of them to work so that they only affect the parameter columns and pick up any possible characters/strings that have been inserted by the various parameter generators.