How to find Tenure values bigger than 2 in a CSV file in Python
I am working on a CSV file with pandas.
```python
import pandas as pd

df = pd.read_csv('/gdrive/MyDrive/sublist.csv', index_col='Username')
print(df)     # print the whole table
df.sample()   # print a random row
```
With this code I can easily print my CSV file and select a random row from it. But I want to find the rows whose "Tenure" value is bigger than 2.
```python
from pandas import DataFrame

df = DataFrame('/gdrive/MyDrive/sublist.csv', columns=['Username', 'Tenure'])
print("Original data frame:\n")
print(df)

# Select the rows whose Tenure is greater than or equal to 1
select_prod = df.loc[df['Tenure'] >= 1]

print("\n")
# Print the selected rows based on the condition
print("Selecting rows:\n")
print(select_prod)
```
I did something wrong, but I can't figure out what. Please help me. I tried different libraries, but I can't find the best one.
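A minimal sketch of the filtering step, assuming the CSV really has Username and Tenure columns; note that pd.read_csv parses the file, while the bare DataFrame constructor above does not accept a file path at all:

```python
import pandas as pd

# Read the CSV file; DataFrame('/path/to/file.csv') does not parse a path.
df = pd.read_csv('/gdrive/MyDrive/sublist.csv')

# Keep only the rows whose Tenure is strictly greater than 2.
bigger_than_2 = df.loc[df['Tenure'] > 2, ['Username', 'Tenure']]
print(bigger_than_2)
```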
See also questions close to this topic
-
Capture live output of a subprocess without blocking it
I am currently writing a script that will at some point need to capture the output of a heavy `grep` command. The problem is, I can't obtain decent performance from the command compared to just running it in the shell (it appears to be at least 2x slower, sometimes never ending). I'm grepping a whole partition (that's part of the purpose of the script), so I know it's a slow operation; I'm talking about the huge difference between the runtime in the shell and in my Python script.
I've struggled with it for quite some time. I tried the queue library, threading, multiprocessing, and gave asyncio a bit of a shot. I'm getting really lost. I've shortened it to its simplest form; here it is:
```python
from subprocess import PIPE, Popen

p = Popen(['grep', '-a', '-b', 'MyTestString', '/dev/sda1'],
          stdin=None, stdout=PIPE, stderr=PIPE, bufsize=-1)

while True:
    output = p.stdout.readline()
    if output:
        print(output.strip())
```
So here, my `grep` command is way slower than in the shell. I've tried putting a `time.sleep` in my main loop, but it seems to make things worse. Just a few more details:
- There will be very little output from the command.
- The final goal is to grab the output without blocking the main thread, but one problem at a time.
- Again, I know that my grep command is a heavy one.
Thanks for your help if you have any idea or suggestion. I'm on the verge of despair :(
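One pattern that may help here is to drain stdout on a background thread and hand lines to the main thread through a queue, so the main loop never blocks on readline. A minimal sketch, reusing the grep invocation from the question (whether it closes the shell-vs-script performance gap is a separate matter):

```python
from queue import Empty, Queue
from subprocess import PIPE, Popen
from threading import Thread

def drain(pipe, q):
    # Forward every line the child writes, then signal EOF with None.
    for line in iter(pipe.readline, b''):
        q.put(line)
    q.put(None)

p = Popen(['grep', '-a', '-b', 'MyTestString', '/dev/sda1'], stdout=PIPE)
q = Queue()
Thread(target=drain, args=(p.stdout, q), daemon=True).start()

while True:
    try:
        line = q.get(timeout=0.1)  # main thread stays free to do other work
    except Empty:
        continue
    if line is None:
        break                      # grep finished
    print(line.strip())
```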
-
Invalid base64-encoded string: number of data characters (217) cannot be 1 more than a multiple of 4 error in python django
I am getting this incredibly weird error, related to the number of characters in a string, I guess. I am a bit new to Django and I have never seen such a weird error. I checked a few threads, and according to those it is about encoding and decoding something, which I really do not know how to resolve. Here is the traceback:
```
Request Method: GET
Request URL: http://127.0.0.1:8000/profiles/myprofile/
Django Version: 2.1.5
Python Version: 3.9.0
Installed Applications:
['django.contrib.admin', 'django.contrib.auth', 'django.contrib.contenttypes',
 'django.contrib.sessions', 'django.contrib.messages', 'django.contrib.staticfiles',
 'posts', 'profiles']
Installed Middleware:
['django.middleware.security.SecurityMiddleware',
 'django.contrib.sessions.middleware.SessionMiddleware',
 'django.middleware.common.CommonMiddleware',
 'django.middleware.csrf.CsrfViewMiddleware',
 'django.contrib.auth.middleware.AuthenticationMiddleware',
 'django.contrib.messages.middleware.MessageMiddleware',
 'django.middleware.clickjacking.XFrameOptionsMiddleware']

Traceback:
File "C:\Users\aarti\AppData\Local\Programs\Python\Python39\lib\site-packages\django\contrib\sessions\backends\base.py" in _get_session
  190. return self._session_cache

During handling of the above exception ('SessionStore' object has no attribute '_session_cache'), another exception occurred:

File "C:\Users\aarti\AppData\Local\Programs\Python\Python39\lib\site-packages\django\core\handlers\exception.py" in inner
  34. response = get_response(request)
File "C:\Users\aarti\AppData\Local\Programs\Python\Python39\lib\site-packages\django\core\handlers\base.py" in _get_response
  126. response = self.process_exception_by_middleware(e, request)
File "C:\Users\aarti\AppData\Local\Programs\Python\Python39\lib\site-packages\django\core\handlers\base.py" in _get_response
  124. response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "D:\PROJECTS\social\src\profiles\views.py" in my_profile_view
  6. profile = Profile.objects.get(user=request.user)
File "C:\Users\aarti\AppData\Local\Programs\Python\Python39\lib\site-packages\django\db\models\manager.py" in manager_method
  82. return getattr(self.get_queryset(), name)(*args, **kwargs)
File "C:\Users\aarti\AppData\Local\Programs\Python\Python39\lib\site-packages\django\db\models\query.py" in get
  390. clone = self.filter(*args, **kwargs)
File "C:\Users\aarti\AppData\Local\Programs\Python\Python39\lib\site-packages\django\db\models\query.py" in filter
  844. return self._filter_or_exclude(False, *args, **kwargs)
File "C:\Users\aarti\AppData\Local\Programs\Python\Python39\lib\site-packages\django\db\models\query.py" in _filter_or_exclude
  862. clone.query.add_q(Q(*args, **kwargs))
File "C:\Users\aarti\AppData\Local\Programs\Python\Python39\lib\site-packages\django\db\models\sql\query.py" in add_q
  1263. clause, _ = self._add_q(q_object, self.used_aliases)
File "C:\Users\aarti\AppData\Local\Programs\Python\Python39\lib\site-packages\django\db\models\sql\query.py" in _add_q
  1284. child_clause, needed_inner = self.build_filter(
File "C:\Users\aarti\AppData\Local\Programs\Python\Python39\lib\site-packages\django\db\models\sql\query.py" in build_filter
  1176. value = self.resolve_lookup_value(value, can_reuse, allow_joins)
File "C:\Users\aarti\AppData\Local\Programs\Python\Python39\lib\site-packages\django\db\models\sql\query.py" in resolve_lookup_value
  1009. if hasattr(value, 'resolve_expression'):
File "C:\Users\aarti\AppData\Local\Programs\Python\Python39\lib\site-packages\django\utils\functional.py" in inner
  213. self._setup()
File "C:\Users\aarti\AppData\Local\Programs\Python\Python39\lib\site-packages\django\utils\functional.py" in _setup
  347. self._wrapped = self._setupfunc()
File "C:\Users\aarti\AppData\Local\Programs\Python\Python39\lib\site-packages\django\contrib\auth\middleware.py" in <lambda>
  24. request.user = SimpleLazyObject(lambda: get_user(request))
File "C:\Users\aarti\AppData\Local\Programs\Python\Python39\lib\site-packages\django\contrib\auth\middleware.py" in get_user
  12. request._cached_user = auth.get_user(request)
File "C:\Users\aarti\AppData\Local\Programs\Python\Python39\lib\site-packages\django\contrib\auth\__init__.py" in get_user
  182. user_id = _get_user_session_key(request)
File "C:\Users\aarti\AppData\Local\Programs\Python\Python39\lib\site-packages\django\contrib\auth\__init__.py" in _get_user_session_key
  59. return get_user_model()._meta.pk.to_python(request.session[SESSION_KEY])
File "C:\Users\aarti\AppData\Local\Programs\Python\Python39\lib\site-packages\django\contrib\sessions\backends\base.py" in __getitem__
  55. return self._session[key]
File "C:\Users\aarti\AppData\Local\Programs\Python\Python39\lib\site-packages\django\contrib\sessions\backends\base.py" in _get_session
  195. self._session_cache = self.load()
File "C:\Users\aarti\AppData\Local\Programs\Python\Python39\lib\site-packages\django\contrib\sessions\backends\db.py" in load
  44. return self.decode(s.session_data) if s else {}
File "C:\Users\aarti\AppData\Local\Programs\Python\Python39\lib\site-packages\django\contrib\sessions\backends\base.py" in decode
  101. encoded_data = base64.b64decode(force_bytes(session_data))
File "C:\Users\aarti\AppData\Local\Programs\Python\Python39\lib\base64.py" in b64decode
  87. return binascii.a2b_base64(s)

Exception Type: Error at /profiles/myprofile/
Exception Value: Invalid base64-encoded string: number of data characters (217) cannot be 1 more than a multiple of 4
```
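A hedged note, in case it helps: the failure happens while Django decodes a stored session. Assuming stale session rows are the culprit (for example, rows written under a different Django version or SECRET_KEY), one commonly suggested workaround is to delete the stored sessions so they are recreated at the next login; a sketch to run inside `python manage.py shell`:

```python
# Hedged workaround: drop all stored sessions so Django recreates them.
# Every user will have to log in again afterwards.
from django.contrib.sessions.models import Session

Session.objects.all().delete()
```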
-
Django register user
Hi, I want to create a normal user using UserCreationForm together with my own Register creation form. When I post the form with the button, nothing is added to my Postgres auth_user and auth_users_register tables. Here is my code:
forms.py
```python
class RegisterForm(ModelForm):
    class Meta:
        model = Register
        fields = ['date_of_birth', 'image_add']
        widgets = {
            'date_of_birth': DateInput(attrs={'type': 'date'})
        }

class CreateUserForm(UserCreationForm):
    class Meta:
        model = User
        fields = ['username', 'email', 'password1', 'password2']
```
models.py
```python
def validate_image(image_add):
    max_height = 250
    max_width = 250
    if 250 < max_width or 250 < max_height:
        raise ValidationError("Height or Width is larger than what is allowed")

class Register(models.Model):
    user = models.OneToOneField(User, on_delete=models.CASCADE)
    date_of_birth = models.DateField(max_length=8, verbose_name="date of birth")
    image_add = models.ImageField(upload_to="avatars", verbose_name="avatar",
                                  validators=[validate_image])
```
views.py
```python
class RegisterPageView(View):
    def get(self, request):
        if request.user.is_authenticated:
            return redirect('/')
        user_form = CreateUserForm(request.POST)
        register_form = RegisterForm(request.POST)
        return render(request, 'register.html',
                      {'user_form': user_form, 'register_form': register_form})

    def post(self, request):
        if request.method == 'POST':
            user_form = CreateUserForm(request.POST)
            register_form = RegisterForm(request.POST)
            if user_form.is_valid() and register_form.is_valid():
                user_form.save()
                register_form.save(commit=False)
                user = user_form.cleaned_data.get('username')
                messages.success(request, 'Your account has been registered' + user)
                return redirect('login')
        user_form = CreateUserForm()
        register_form = RegisterForm()
        context = {'user_form': user_form, 'register_form': register_form}
        return render(request, 'register.html', context)
```
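A hedged sketch of a post() that would actually persist both rows, assuming the issue is that save(commit=False) returns an unsaved Register instance whose user still has to be set. It reuses CreateUserForm and RegisterForm from above; request.FILES is needed because of the ImageField:

```python
def post(self, request):
    user_form = CreateUserForm(request.POST)
    register_form = RegisterForm(request.POST, request.FILES)
    if user_form.is_valid() and register_form.is_valid():
        user = user_form.save()                       # writes the auth_user row
        register = register_form.save(commit=False)   # unsaved Register instance
        register.user = user                          # attach the OneToOneField
        register.save()                               # writes the register row
        messages.success(request, 'Your account has been registered, ' + user.username)
        return redirect('login')
    context = {'user_form': user_form, 'register_form': register_form}
    return render(request, 'register.html', context)
```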
-
How to calculate the body size of candle for OHLC for comparison
I need help understanding how to calculate the candle body from OHLC data, and I would like to make the following classifications from the OHLC.
STRONG BUY: if the next candle's body is outside the last candle's body AND the next candle's body is more than 2x the size of the last candle's body AND the next candle's body is ABOVE the last candle's body
BUY: if the next candle's body is outside the last candle's body AND the next candle's body is larger than the last candle's body AND the next candle's body is ABOVE the last candle's body
STRONG SELL: if the next candle's body is outside the last candle's body AND the next candle's body is more than 2x the size of the last candle's body AND the next candle's body is BELOW the last candle's body
SELL: if the next candle's body is outside the last candle's body AND the next candle's body is larger than the last candle's body AND the next candle's body is BELOW the last candle's body
Everything else is Neutral.
Below is the code logic we have come up with
```
classification_body_size = ABS(class_open - class_high)   // determines size of classification candle
last_body_size = ABS(last_open - last_close)              // determines size of last candle in the graph
classification = "Neutral"

if classification_open >= last_close AND classification_close > last_close
    if classification_body_size > 2 x last_body_size
        classification = "STRONG_BUY"
    else
        classification = "BUY"
    endif
endif

if classification_open <= last_close AND classification_close < last_close
    if classification_body_size > 2 x last_body_size
        classification = "STRONG_SELL"
    else
        classification = "SELL"
    endif
endif
```
Sample dataset:

```
    Dates       PX_OPEN   PX_HIGH   PX_LOW    PX_LAST   Predict   PX_LAST_SHIFTED
50  2000-03-15  136.875   140.4375  136.0625  139.8125  146.3438  146.3438
51  2000-03-16  141.625   146.8438  140.875   146.3438  146.9375  146.9375
52  2000-03-17  145.8125  148.0     145.4375  146.9375  146.1875  146.1875
53  2000-03-20  146.875   147.3438  144.7813  146.1875  149.1875  149.1875
54  2000-03-21  145.5313  149.75    144.5     149.1875  150.0938  150.0938
55  2000-03-22  149.5625  150.8438  148.6875  150.0938  152.6563  152.6563
56  2000-03-23  149.1563  153.4688  149.1563  152.6563  153.5625  153.5625
57  2000-03-24  152.875   155.75    151.7188  153.5625  151.9375  151.9375
58  2000-03-27  153.375   153.7813  151.8125  151.9375  151.0625  151.0625
59  2000-03-28  151.25    152.9844  150.5938  151.0625  151.2188  151.2188
60  2000-03-29  151.5625  152.4844  149.6563  151.2188  148.6875  148.6875
61  2000-03-30  150.1563  151.9219  147.125   148.6875  150.375   150.375
62  2000-03-31  149.625   152.3125  148.4375  150.375   151.25    151.25
63  2000-04-03  150.125   151.25    148.6875  151.25    145.75    145.75
64  2000-04-04  151.75    153.0     145.75    145.75    149.1875  149.1875
65  2000-04-05  147.875   150.8125  147.625   149.1875  150.4844  150.4844
66  2000-04-06  150.25    151.6875  149.0     150.4844  151.4375  151.4375
67  2000-04-07  151.5625  152.125   150.5     151.4375  150.8438  150.8438
68  2000-04-10  151.75    153.1094  150.3125  150.8438  150.4063  150.4063
69  2000-04-11  150.0     151.625   148.375   150.4063  146.2813  146.2813
70  2000-04-12  150.375   151.1563  146.1563  146.2813  144.25    144.25
71  2000-04-13  147.4688  148.1563  141.125   144.25    136.0     136.0
72  2000-04-14  142.625   142.8125  133.5     136.0     140.75    140.75
73  2000-04-17  135.1875  140.75    134.6875  140.75    144.4688  144.4688
74  2000-04-18  140.5625  144.4688  139.7813  144.4688  143.125   143.125
75  2000-04-19  144.5     145.125   142.5313  143.125   143.8125  143.8125
76  2000-04-20  143.5625  143.9375  142.375   143.8125  142.25    142.25
77  2000-04-24  141.5     143.3125  140.5     142.25    148.1563  148.1563
78  2000-04-25  144.625   148.1563  144.4375  148.1563  146.4844  146.4844
79  2000-04-26  147.9688  148.75    146.0     146.4844  146.0     146.0
80  2000-04-27  143.0     147.3438  143.0     146.0     145.0938  145.0938
81  2000-04-28  147.0     147.8594  145.0625  145.0938  147.0625  147.0625
82  2000-05-01  146.5625  148.4844  145.8438  147.0625  144.125   144.125
83  2000-05-02  145.5     147.125   144.125   144.125   140.75    140.75
84  2000-05-03  144.0     144.0     139.7813  140.75    141.8125  141.8125
85  2000-05-04  142.0     142.3594  140.75    141.8125  143.5313  143.5313
86  2000-05-05  141.0625  144.0     140.9375  143.5313  142.4531  142.4531
87  2000-05-08  142.75    143.375   141.8438  142.4531  141.3125  141.3125
88  2000-05-09  143.0625  143.4063  140.2656  141.3125  138.125   138.125
89  2000-05-10  140.5     140.9688  137.75    138.125   141.2813  141.2813
90  2000-05-11  140.125   141.5     139.125   141.2813  142.8125  142.8125
91  2000-05-12  141.8125  143.4688  141.625   142.8125  145.2813  145.2813
92  2000-05-15  142.75    145.6094  142.0     145.2813  146.6875  146.6875
93  2000-05-16  146.5625  147.7188  145.3125  146.6875  145.1563  145.1563
94  2000-05-17  145.6875  146.1875  144.4688  145.1563  143.375   143.375
95  2000-05-18  145.625   146.3125  143.375   143.375   141.125   141.125
96  2000-05-19  142.5625  143.2344  140.4063  141.125   140.0625  140.0625
97  2000-05-22  141.25    141.4688  137.0     140.0625  138.0     138.0
98  2000-05-23  140.4375  140.8125  137.5625  138.0     140.25    140.25
99  2000-05-24  138.0     140.6875  136.5     140.25    137.8438  137.8438
```
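A minimal pandas sketch of this logic, assuming a DataFrame `df` holding the sample above and reading "outside" as the whole body clearing the previous body. The column names come from the dataset; everything else is an assumption, not a definitive implementation:

```python
import numpy as np

# Body size of each candle: absolute distance between open and close.
df['body'] = (df['PX_OPEN'] - df['PX_LAST']).abs()

prev_open = df['PX_OPEN'].shift(1)
prev_close = df['PX_LAST'].shift(1)
prev_body = df['body'].shift(1)

prev_top = np.maximum(prev_open, prev_close)     # top of the last body
prev_bottom = np.minimum(prev_open, prev_close)  # bottom of the last body

# "Outside": the whole body clears the previous body, above or below it.
above = np.minimum(df['PX_OPEN'], df['PX_LAST']) > prev_top
below = np.maximum(df['PX_OPEN'], df['PX_LAST']) < prev_bottom

df['classification'] = np.select(
    [above & (df['body'] > 2 * prev_body),
     above & (df['body'] > prev_body),
     below & (df['body'] > 2 * prev_body),
     below & (df['body'] > prev_body)],
    ['STRONG_BUY', 'BUY', 'STRONG_SELL', 'SELL'],
    default='Neutral')
```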
-
Group list inside dict using multiple keys in python3.6
Here I'm given input JSON data. Can anyone help me produce the expected result in the format below? I'm trying to solve it, but I can't get the expected result format.
```python
sub_type_list = [
    {
        'name': 'blood',
        'transaction_points': [
            {'point': '(1-10)',  'value': '', 'symbol': '', 'service_id': ''},
            {'point': '(10-20)', 'value': '', 'symbol': '', 'service_id': '423'},
            {'point': '(20-30)', 'value': '', 'symbol': '', 'service_id': '1'},
        ]
    },
    {
        'name': 'blood',
        'transaction_points': [
            {'point': '(1-10)',  'value': '', 'symbol': '', 'service_id': '123'},
            {'point': '(10-20)', 'value': '', 'symbol': '', 'service_id': ''},
            {'point': '(20-30)', 'value': '', 'symbol': '', 'service_id': ''},
        ]
    },
    {
        'name': 'body',
        'transaction_points': [
            {'point': '(1-10)',  'value': '', 'symbol': '', 'service_id': ''},
            {'point': '(10-20)', 'value': '', 'symbol': '', 'service_id': '42'},
            {'point': '(20-30)', 'value': '', 'symbol': '', 'service_id': '11'},
        ]
    },
    {
        'name': 'blood',
        'transaction_points': [
            {'point': '(1-10)',  'value': '', 'symbol': '', 'service_id': '87'},
            {'point': '(10-20)', 'value': '', 'symbol': '', 'service_id': '50'},
            {'point': '(20-30)', 'value': '', 'symbol': '', 'service_id': '25'},
        ]
    },
]
```
Expected result below:
```python
[
    {
        'name': 'blood',
        'transaction_points': [
            [
                {'point': '(1-10)',  'value': '', 'symbol': '', 'service_id': ''},
                {'point': '(1-10)',  'value': '', 'symbol': '', 'service_id': '123'},
                {'point': '(1-10)',  'value': '', 'symbol': '', 'service_id': '87'},
            ],
            [
                {'point': '(10-20)', 'value': '', 'symbol': '', 'service_id': '423'},
                {'point': '(10-20)', 'value': '', 'symbol': '', 'service_id': ''},
                {'point': '(10-20)', 'value': '', 'symbol': '', 'service_id': '50'},
            ],
            [
                {'point': '(20-30)', 'value': '', 'symbol': '', 'service_id': '1'},
                {'point': '(20-30)', 'value': '', 'symbol': '', 'service_id': ''},
                {'point': '(20-30)', 'value': '', 'symbol': '', 'service_id': '25'},
            ],
        ]
    },
    {
        'name': 'body',
        'transaction_points': [
            [{'point': '(1-10)',  'value': '', 'symbol': '', 'service_id': ''}],
            [{'point': '(10-20)', 'value': '', 'symbol': '', 'service_id': '42'}],
            [{'point': '(20-30)', 'value': '', 'symbol': '', 'service_id': '11'}],
        ]
    },
]
```
Here is the code I'm using to try to generate the expected result:
```python
result = dict()
final_result = []

for id, i in enumerate(sub_type_list):
    if result.get(i["name"]):
        result[i["name"]].extend(i["transaction_points"])
    else:
        result[i["name"]] = i["transaction_points"]

for i in result.keys():
    final_result.append({"name": i, "transaction_points": result[i]})
```
The `final_result` list produces the result below, but I want the expected result above:
```python
[
    {
        'name': 'blood',
        'transaction_points': [
            {'point': '(1-10)',  'value': '', 'symbol': '', 'service_id': ''},
            {'point': '(10-20)', 'value': '', 'symbol': '', 'service_id': '423'},
            {'point': '(20-30)', 'value': '', 'symbol': '', 'service_id': '1'},
            {'point': '(1-10)',  'value': '', 'symbol': '', 'service_id': '123'},
            {'point': '(10-20)', 'value': '', 'symbol': '', 'service_id': ''},
            {'point': '(20-30)', 'value': '', 'symbol': '', 'service_id': ''},
            {'point': '(1-10)',  'value': '', 'symbol': '', 'service_id': '87'},
            {'point': '(10-20)', 'value': '', 'symbol': '', 'service_id': '50'},
            {'point': '(20-30)', 'value': '', 'symbol': '', 'service_id': '25'},
        ]
    },
    {
        'name': 'body',
        'transaction_points': [
            {'point': '(1-10)',  'value': '', 'symbol': '', 'service_id': ''},
            {'point': '(10-20)', 'value': '', 'symbol': '', 'service_id': '42'},
            {'point': '(20-30)', 'value': '', 'symbol': '', 'service_id': '11'},
        ]
    },
]
```
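A sketch that groups first by name and then by point range, assuming the expected output is really a list of per-point groups under each name (that is how I read the example above, since its literal `{ [ ... ] }` nesting is not valid Python):

```python
from collections import defaultdict

by_name = defaultdict(lambda: defaultdict(list))
for item in sub_type_list:
    for tp in item['transaction_points']:
        # bucket each transaction point under (name, point range)
        by_name[item['name']][tp['point']].append(tp)

final_result = [
    {'name': name, 'transaction_points': list(groups.values())}
    for name, groups in by_name.items()
]
```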
-
Cleaner version of stacking and merging in pandas?
Here is a toy example of transforming `df` into `df_f` (basically it is a kind of wide-to-long transform). I wonder if it could be done in a cleaner, more concise and direct way?
```python
import pandas as pd

df = pd.DataFrame({'x': ['a', 'b', 'c'], 'A': [1, 2, 3], 'B': [7, 8, 9]})
#    x  A  B
# 0  a  1  7
# 1  b  2  8
# 2  c  3  9

cols_to_stack = ['A', 'B']
df_s = df[cols_to_stack].stack().rename_axis(('idx', 'c')).reset_index(name='value')
#    idx  c  value
# 0    0  A      1
# 1    0  B      7
# 2    1  A      2
# 3    1  B      8
# 4    2  A      3
# 5    2  B      9

# There could be more columns than x, but we can assume df_ind will be
# df.columns "minus" cols_to_stack.
df_ind = df[['x']]
df_f = pd.merge(df_s, df_ind, how='left', left_on='idx',
                right_index=True).drop(columns=['idx'])
#    c  value  x
# 0  A      1  a
# 1  B      7  a
# 2  A      2  b
# 3  B      8  b
# 4  A      3  c
# 5  B      9  c
```
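If it helps, DataFrame.melt does this wide-to-long step in one call; a sketch on the same toy frame (the column order of the result differs slightly from df_f):

```python
import pandas as pd

df = pd.DataFrame({'x': ['a', 'b', 'c'], 'A': [1, 2, 3], 'B': [7, 8, 9]})

# id_vars stay as identifier columns; everything else becomes c/value pairs.
df_f = df.melt(id_vars=['x'], var_name='c', value_name='value')
print(df_f)
```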
-
how to fill data frame's null value (empty) cells with the mean
Before replacing null values with the mean, the data shows 313 null or empty cells in MINIMUM_PAYMENTS.
```python
creditcard_df.loc[(creditcard_df['MINIMUM_PAYMENTS'].isnull() == True),
                  'MINIMUM_PATMENTS'] = creditcard_df['MINIMUM_PAYMENTS'].mean()
```
I have used the above code to fill all the null cells with the mean, but after executing it and re-checking the null counts, I still get the same result.
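A hedged observation: the assignment target in the snippet spells the column 'MINIMUM_PATMENTS' while the mask reads 'MINIMUM_PAYMENTS', so the statement creates a new misspelled column and leaves the original untouched. A sketch of the same fill using fillna:

```python
# fillna touches only the missing cells of the (correctly spelled) column.
mean_payment = creditcard_df['MINIMUM_PAYMENTS'].mean()
creditcard_df['MINIMUM_PAYMENTS'] = creditcard_df['MINIMUM_PAYMENTS'].fillna(mean_payment)
```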
-
case when age groups in Pyspark
I have the following error from Pyspark:
py4j.Py4JException: Method and([class java.lang.Integer]) does not exist
Here is the code used:
```python
# select columns from dimensional customers
customer = (spark.table(f'nn_squad7_{country}.dim_customers')
            .select(f.col('customer_id').alias('customers'),
                    f.col('gender').alias('gender'),
                    f.col('age').alias('age'))
            .withColumn('age_group',
                        f.when(f.col('age') > 18 & f.col('age') < 25, '18-25')
                         .when(f.col('age') > 26 & f.col('age') < 35, '26-35')
                         .when(f.col('age') > 36 & f.col('age') < 50, '36-50')
                         .when(f.col('age') > 51 & f.col('age') < 60, '51-60')
                         .when(f.col('age') > 60, '+60')
                         .otherwise('undefined')))
```
I'm not sure where the mistake is, as the error doesn't make sense to me. Any clue?
Thanks!
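For future readers, a hedged guess at the cause: in Python, & binds more tightly than the comparison operators, so `f.col('age') > 18 & f.col('age') < 25` is parsed as `f.col('age') > (18 & f.col('age')) < 25`, which produces exactly this kind of Py4J error. A sketch of the same column with each comparison parenthesized:

```python
from pyspark.sql import functions as f

customer = customer.withColumn(
    'age_group',
    f.when((f.col('age') > 18) & (f.col('age') < 25), '18-25')
     .when((f.col('age') > 26) & (f.col('age') < 35), '26-35')
     .when((f.col('age') > 36) & (f.col('age') < 50), '36-50')
     .when((f.col('age') > 51) & (f.col('age') < 60), '51-60')
     .when(f.col('age') > 60, '+60')
     .otherwise('undefined'))
```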
-
all features must be in [0, 9] or [-10, 0]
I have the following code:
```python
df = load_data()
pd.set_option('display.max_columns', None)
df.dtypes

intBillID                      object
chBillChargeCode               object
chBillNo                       object
chOriginalBillNo               object
sdBillDate             datetime64[ns]
sdDueDate              datetime64[ns]
sdDatePaidCancelled    datetime64[ns]
sdBillCancelledDate            object
totalDaysToPay                  int64
paidInDays                      int64
paidOnTime                      int64
chBillStatus                   object
chBillType                     object
chDebtorCode                   object
chBillGroupCode                 int64
dcTotFeeBilledAmt             float64
dcFinalBillExpAmt             float64
dcTotProgBillAmt              float64
dcTotProgBillExpAmt           float64
dcReceiveBillAmt              float64
dcTotWipHours                 float64
dcTotWipTargetAmt             float64
vcReason                       object
OperatingUnit                  object
BusinessUnit                   object
LosCode                        object
dcTotNetBillAmt               float64
dtype: object
```
Then I have this:
```python
# Separate features and labels
feature_cols = ['totalDaysToPay', 'paidOnTime', 'dcTotFeeBilledAmt', 'dcFinalBillExpAmt',
                'dcTotProgBillAmt', 'dcTotProgBillExpAmt', 'dcTotProgBillExpAmt',
                'dcReceiveBillAmt', 'dcTotWipHours', 'dcTotWipTargetAmt']
X, y = df[feature_cols].values, df['paidInDays'].values
print('Features:', X[:10], '\nLabels:', y[:10], sep='\n')
```
Then I split X and y:
```python
from sklearn.model_selection import train_test_split

# Split data 70%-30% into training set and test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=0)
print('Training Set: %d rows\nTest Set: %d rows' % (X_train.shape[0], X_test.shape[0]))
```
Then I want to transform the numeric and categorical features:
```python
# Train the model
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.linear_model import LinearRegression
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Define preprocessing for numeric columns (scale them)
numeric_features = [8, 9, 10, 11, 12, 13, 15, 16, 17, 18, 19, 20, 21, 26]
numeric_transformer = Pipeline(steps=[
    ('scaler', StandardScaler())])

# Define preprocessing for categorical features (encode them)
categorical_features = [1, 23, 24, 25]
categorical_transformer = Pipeline(steps=[
    ('onehot', OneHotEncoder(handle_unknown='ignore'))])

# Combine preprocessing steps
preprocessor = ColumnTransformer(
    transformers=[
        ('num', numeric_transformer, numeric_features),
        ('cat', categorical_transformer, categorical_features)])

# Create preprocessing and training pipeline
pipeline = Pipeline(steps=[('preprocessor', preprocessor),
                           ('regressor', GradientBoostingRegressor())])

# fit the pipeline to train a gradient boosting regressor on the training set
model = pipeline.fit(X_train, y_train)
print(model)
```
However I get this error:
ValueError: all features must be in [0, 9] or [-10, 0]
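A hedged reading of this error: X was built from just 10 columns, so valid positional indices run 0 to 9, while the ColumnTransformer asks for indices up to 26, which are positions in the original df rather than in X. One sketch of a fix is to keep X as a DataFrame and select by column name, reusing the numeric_transformer defined above (all the selected features are numeric, so the categorical branch is dropped here):

```python
from sklearn.compose import ColumnTransformer

feature_cols = ['totalDaysToPay', 'paidOnTime', 'dcTotFeeBilledAmt',
                'dcFinalBillExpAmt', 'dcTotProgBillAmt', 'dcTotProgBillExpAmt',
                'dcReceiveBillAmt', 'dcTotWipHours', 'dcTotWipTargetAmt']

X = df[feature_cols]            # keep a DataFrame so column names stay available
y = df['paidInDays'].values

# Column names (not positions in the original df) now drive the transformer.
preprocessor = ColumnTransformer(
    transformers=[('num', numeric_transformer, feature_cols)])
```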