JSON decode error while trying to use multiprocessing with tqdm
I'm trying to run the following code:
iqs_pool = Pool(processes=30)
lst = []
for obj in tqdm(iqs_pool.imap(f, df["col"]), total=len(df)):
    lst.append(obj)
iqs_pool.close()
iqs_pool.join()
iqs_pool.terminate()
Basically I'm trying to apply the function f to the col column of df. The function f contains a line that runs the .json() method on a requests response object returned via the requests.request function.
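For context, f is roughly of this shape; the method, URL and parameters below are placeholders rather than the real ones, and the .json() call at the end is the line that raises:

import requests

def f(value):
    # Placeholder request: the real method, endpoint and parameters differ.
    resp = requests.request("GET", "https://api.example.com/lookup", params={"q": value})
    # Printing the status and the start of the raw body here would show whether
    # the server actually returned JSON or an empty/HTML error page.
    # print(resp.status_code, resp.text[:200])
    return resp.json()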
When I run this, I get the following exception:
JSONDecodeError Traceback (most recent call last)
<ipython-input-20-fc66deeabd09> in <module>
2 lst = []
3
----> 4 for obj in tqdm(iqs_pool.imap(f, df["col"]), total=len(df)):
5 lst.append(obj)
6
~/anaconda3/lib/python3.7/site-packages/tqdm/_tqdm_notebook.py in __iter__(self, *args, **kwargs)
221 def __iter__(self, *args, **kwargs):
222 try:
--> 223 for obj in super(tqdm_notebook, self).__iter__(*args, **kwargs):
224 # return super(tqdm...) will not catch exception
225 yield obj
~/anaconda3/lib/python3.7/site-packages/tqdm/_tqdm.py in __iter__(self)
1015 """), fp_write=getattr(self.fp, 'write', sys.stderr.write))
1016
-> 1017 for obj in iterable:
1018 yield obj
1019 # Update and possibly print the progressbar.
~/anaconda3/lib/python3.7/multiprocessing/pool.py in next(self, timeout)
746 if success:
747 return value
--> 748 raise value
749
750 __next__ = next # XXX
JSONDecodeError: Expecting value: line 1 column 1 (char 0)
I then ran f sequentially on df rows:
lst = []
j = 0
for i in range(len(df)):
    j = i
    lst.append(f(df["col"].loc[i]))
So if the loop is terminated because of an error, I'll know which row of df is giving problems. I ran it four times, got the JSON decode error all 4 times, and got values of j of 11, 0, 0 and 19. This makes no sense to me and I'm completely confused. If 0 was the problematic index, j should've taken that value all four times. If 19 was the first problematic index, then I shouldn't get any prior values of j at all.
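A variant of this loop that records every failing index instead of stopping at the first one would make the pattern easier to see (purely a debugging sketch, same f and df as above):

lst = []
failed = []
for i in range(len(df)):
    try:
        lst.append(f(df["col"].loc[i]))
    except Exception as exc:  # broad catch, debugging only
        failed.append((i, repr(exc)))
print(failed)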
So I tried one last thing:
lst = []
j = 0
for i in range(20):
    j = i
    lst.append(f(df["col"].loc[0]))
Note that I'm just running f on the zeroth row of df 20 times. The loop terminates with a JSON decode error after the tenth iteration (j=10).
What is going on here?
See also questions close to this topic
-
Python3: Use Dateutil to make multiple vars from a long string
I have a program where I would like to randomly pull a line from a song and string it together with lines from other songs. Looking into it, I saw that the dateutil library might be able to help me parse multiple variables from a string, but it doesn't do quite what I want.
I have multiple strings like this, only much longer.
"This is the day\nOf the expanding man\nThat shape is my shade\nThere where I used to stand\nIt seems like only yesterday\nI gazed through the glass\n..."
I want to randomly pull one line from this string (to the page break) and save it as a variable, and then iterate this over multiple strings. Any help would be much appreciated.
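A minimal sketch of the pulling-one-line part, using only the standard library and assuming each song is kept as a newline-separated string like the one above (dateutil isn't needed for this):

import random

song = ("This is the day\nOf the expanding man\n"
        "That shape is my shade\nThere where I used to stand")

def random_line(lyrics):
    # Split on the "\n" breaks and drop any empty lines.
    lines = [line for line in lyrics.splitlines() if line.strip()]
    return random.choice(lines)

print(random_line(song))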
-
How can I write Python selenium test for PrimeFaces?
I am trying to write a Selenium test, but the issue is that the page is generated with PrimeFaces, so the element IDs randomly change from time to time. Not using IDs is not very reliable either. Is there anything I can do?
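One thing I'm considering (a sketch with a made-up ID suffix and URL) is locating elements by the part of the PrimeFaces-generated ID that stays stable, e.g. with a CSS attribute selector in Selenium for Python:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://example.com/app")  # placeholder URL
# Match any element whose id ends with the stable suffix, ignoring the
# randomly generated prefix that PrimeFaces prepends.
save_button = driver.find_element(By.CSS_SELECTOR, "[id$=':saveButton']")
save_button.click()
driver.quit()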
-
Exception when converting metagraph to .pb
I was trying to convert my metagraph to .pb; here is the code I was using to do it:
import argparse
import math
import sys
import cv2
import os
import datetime
import logging, logging.handlers
import logging.config
import tensorflow as tf
import numpy as np
from glob import glob
from tqdm import tqdm

os.environ["CUDA_VISIBLE_DEVICES"] = '0'

meta_path = 'C:\\Users\\dmura\\Desktop\\ToPb\\final.meta'  # Your .meta file
# output_node_names = ['output:0']  # Output nodes
device_name = '/device:GPU:0'

with tf.device(device_name):
    config = tf.ConfigProto(allow_soft_placement=True, log_device_placement=False)
    config.gpu_options.allow_growth = True
    config.gpu_options.per_process_gpu_memory_fraction = 0.9
    with tf.Session() as sess:
        # Restore the graph
        saver = tf.train.import_meta_graph(meta_path)
        # Load weights
        saver.restore(sess, tf.train.latest_checkpoint('C:\\Users\\dmura\\Desktop\\ToPb\\final'))
        # Freeze the graph
        output_node_names = [n.name for n in tf.get_default_graph().as_graph_def().node]
        frozen_graph_def = tf.graph_util.convert_variables_to_constants(
            sess, sess.graph_def, output_node_names)
        # Save the frozen graph
        with open('output_graph.pb', 'wb') as f:
            f.write(frozen_graph_def.SerializeToString())
When I was trying to save frozen_graph_def, I got the following exception:
Exception has occurred: ValueError
Message: tensorflow.GraphDef exceeds maximum protobuf size of 2GB: 2255900840
  File "C:\Users\VIDesktop\pbConverter\converter.py", line 46, in <module>
    f.write(frozen_graph_def.SerializeToString())
Do you have any idea how I can solve this problem?
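For reference, tf.graph_util.convert_variables_to_constants only keeps the subgraph needed to compute the nodes you list, so passing one real output node name (the name below is a placeholder) instead of every node in the graph is what I understand the usual pattern to be; I'm not sure whether that alone gets under the 2 GB protobuf limit:

# Assumes the same session/saver setup as in the code above (TF 1.x API).
output_node_names = ['output/predictions']  # placeholder: use the model's actual output node
frozen_graph_def = tf.graph_util.convert_variables_to_constants(
    sess, sess.graph_def, output_node_names)
with open('output_graph.pb', 'wb') as f:
    f.write(frozen_graph_def.SerializeToString())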
-
Filter specific timed json objects from a json objects array
I am using an Angular service to hit an API and get back a big JSON object array like
I want to separate the objects that were created today, lastWeek and lastMonth, provided that I only call the service once and filter out the lastMonth data like this.
Service:
getAllParcels(start: any): Observable<IParcel[]> {
  return this._httpClient.get<IParcel[]>(
    `${environment.url}/Parcels?filter[where][createdAt][gte]=${start}`
  );
}
I am getting the parcels, then sorting them, and then showing them on the frontend in their respective fields like this:
sortParcels() {
  this._analyticsDashboardService
    .getAllParcels(this.today)
    .pipe(
      flatMap(res => {
        return forkJoin([
          this._analyticsDashboardService.getAllRepos(),
          of(res)
        ]);
      }),
      flatMap(res => {
        const parcelRepos = res[0];
        const parcels = res[1];
        // bend data to the form needed
        for (const parcel of parcels) {
          for (const repo of parcelRepos) {
            if (parcel.currentStatusId === repo.id) {
              if (this.sortedParcels[repo.key]) {
                this.sortedParcels[repo.key].parcels.push(parcel);
              } else {
                this.sortedParcels[repo.key] = {
                  parcels: [parcel],
                  label: repo.status,
                  icon: repo.iconClass
                };
              }
            }
          }
        }
        // initialize not included statuses
        parcelRepos.forEach(e => {
          if (!this.sortedParcels[e.key]) {
            this.sortedParcels[e.key] = {
              parcels: [],
              label: e.status,
              icon: e.iconClass
            };
          }
        });
        // exclude following statuses
        [
          "parcel-picked-up",
          "awaiting-return-to-vendor",
          "batch-picked-up",
          "parcel-assigned-to-rider",
          "reattempt-delivered",
          "parcel-assigned-to-rider-for-delivery",
          "parcel-assigned-to-rider-for-pickup"
        ].forEach(key => {
          delete this.sortedParcels[key];
        });
        // sort data
        // for (const i in this.sortedParcels) {
        //   for (const j in this.sortedParcels) {
        //   }
        // }
        return of(this.sortedParcels);
      })
    )
    .subscribe(val => {
      this.sortedParcels = val;
      console.log(this.sortedParcels);
      const temp = [];
      // forming result
      for (const key in this.sortedParcels) {
        if (key) {
          temp.push({
            name: this.sortedParcels[key].label,
            value: this.sortedParcels[key].parcels.length
          });
          this.totalParcels += this.sortedParcels[key].parcels.length;
        }
        // temp.push({
        //   name: this.sortedParcels[key].label,
        //   value: this.sortedParcels[key].parcels.length
        // });
        // this.totalParcels += this.sortedParcels[key].parcels.length;
      }
      this.parcelStatusWidgetResult = temp;
      console.log(this.parcelStatusWidgetResult);
      this._cdRef.detectChanges();
    });
}
But now I need to separate the today and lastWeek parcels from the data that I get when I call the service. I cannot make another API call, so I need some help with code for how to separate the lastWeek and today's parcels. This is how I am getting the dates using Moment.js:
this.startOfMonth = moment().add(-31, 'days').toISOString();
this.lastWeek = moment().add(-7, 'days').toISOString();
this.today = moment().startOf('day').toISOString();
-
Discord.on_message() missing 1 required positional argument: 'ctx'
I'm working on a user level system and I'm having big problems getting my Discord leveling-up code to start; I don't have any ideas how to fix it.
Here is my code:
import discord
import asyncio
import json
from discord.ext import commands


class Levels:
    def __init__(self, client):
        self.bot = client
        self.bot.loop.create_task(self.save_users())
        with open(r'C:\Users\conex_000\PycharmProjects\Shirochka\users.json', 'r') as f:
            self.users = json.load(f)

    async def save_users(self):
        await self.bot.wait_until_ready()
        while not self.bot.is_closed():
            with open(r'C:\Users\conex_000\PycharmProjects\Shirochka\users.json', 'w') as f:
                json.dump(self.users, f, indent=4)
            await asyncio.sleep(5)

    def lvl_up(self, author_id):
        cur_xp = self.users[author_id]['exp']
        cur_lvl = self.users[author_id]['level']
        if cur_xp >= round((4 * (cur_lvl ** 3)) / 5):
            self.users[author_id]['level'] += 1
            return True
        else:
            return False

    async def on_message(self, message):
        if message.author == self.bot.user:
            return
        author_id = str(message.author.id)
        if not author_id in self.users:
            self.users[author_id] = {}
            self.users[author_id]['level'] = 1
            self.users[author_id]['exp'] = 0
        self.users[author_id]['exp'] += 1
        if self.lvl_up(author_id):
            await message.channel.send(f"{message.author.mention} is now level {self.users[author_id]['level']}")

    @client.command()
    async def level(self, ctx, member: discord.Member = None):
        member = ctx.author if not member else member
        member_id = str(member.id)
        if not member.id in self.users:
            await ctx.send("Can't identify a member")
        else:
            embed = discord.Embed(color=member.color, timestamp=ctx.message.created_at)
            embed.set_author(name=f'Level - {member}', icon_url=client.user.avatar_url)
            embed.add_field(name='Level', value=self.users[member_id]['level'])
            embed.add_field(name='Level', value=self.users[member_id]['exp'])
            await ctx.send(embed=embed)
This code starts, but when I type any message in the Discord chat, the program returns the following error:
TypeError: on_message() missing 1 required positional argument: 'ctx'
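For comparison, in discord.py's cog style (1.x API) the listener and the command are registered with different decorators, so the listener takes only message while the command takes ctx; a minimal sketch, not my actual leveling code:

import discord
from discord.ext import commands

bot = commands.Bot(command_prefix='!')

class Levels(commands.Cog):
    def __init__(self, bot):
        self.bot = bot

    @commands.Cog.listener()
    async def on_message(self, message):
        # Event listeners receive only the message, never ctx.
        if message.author == self.bot.user:
            return
        print(f"saw a message from {message.author}")

    @commands.command()
    async def level(self, ctx, member: discord.Member = None):
        # Commands receive ctx as their first argument after self.
        member = member or ctx.author
        await ctx.send(f"level check for {member} (placeholder)")

bot.add_cog(Levels(bot))
# bot.run('TOKEN')  # token intentionally omitted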
-
I'm trying to read from a json file in my ejs template
I've got a data.json file and I want to read the data stored in it into './views/modules/header.ejs'. Ideally the data in the JSON file will be available to all modules in my project.
I'm using Express.js and want to print the information from my JSON file on my web page. I've been able to load the data in my app.js file using the code below:
// import json
'use strict';
const fs = require('fs');

let jsonData = fs.readFileSync('data.json', 'utf8');
let data = JSON.parse(jsonData);
console.log(data.menu);
This is the content stored in the menu array:
"menu": [ { "title": "Home", "link": "" }, { "title": "Location", "link": "" }, { "title": "Facilities", "link": "" }, { "title": "Building details", "link": "" }, { "title": "Floor plans", "link": "" }, { "title": "Contact", "link": "" } ],
This is my header.ejs file:
<ul class="menu-list">
  <% for(var i = 0; i < data.menu.length; i++) { %>
    <li><%= data.data.menu[i].title %></li>
  <% } %>
</ul>
I believe I'm missing a statement in my ejs file which will call the data. Thanks in advance for your help.
This is what my index.js file looks like:
var express = require('express');
var router = express.Router();

/* GET home page. */
router.get('/', function(req, res, next) {
  res.render('index', { title: 'Bank' });
});

module.exports = router;
-
How do I mock a post request in Python so that it can accept different input values and return different values based on the input?
This is the original post request
result = requests.post('api/' + tenant_name + '/getdomain', json=query_to_execute, verify=False, timeout=100)
json_result = result.json()
Here the query to execute will be an aggregate MongoDB query of the format:
[{$match: {}}, {$limit:25}]
Now, I already have a unittest test case class:
def mock_another_post_request(arg):
    ret_values = [
        [{"name": "Abcd", 'Age': 25}, {"name": "asfcd", 'Age': 45}],
        [{"name": "sdfd", 'Age': 42}]
    ]
    return ret_values[arg]


class MyTests(unittest.TestCase):
    def setup(self):
        patcher = patch('requests.post', new=mock_another_post_request)
        patcher.start()
        self.addCleanup(patch.stopall)

    def MyTestCase():
        expected_result = [{"name": "Abcd", 'Age': 25}, {"name": "asfcd", 'Age': 45}]
        assertEqual expected_result = get_all_docs()
I am very confused about what I am doing. The goal is to mock requests.post and, based on the inputs sent as part of the payload, return different values. How do I do that? Thanks in advance.
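What I think I need is something like unittest.mock's side_effect, where the fake post inspects the json= payload and hands back a mock whose .json() returns the matching data; a sketch with made-up payloads and return values, not the real queries:

from unittest.mock import MagicMock, patch

def fake_post(url, json=None, **kwargs):
    # Choose the fake result based on the payload that was posted.
    response = MagicMock()
    if json and any('$match' in stage for stage in json):
        response.json.return_value = [{"name": "Abcd", "Age": 25},
                                      {"name": "asfcd", "Age": 45}]
    else:
        response.json.return_value = [{"name": "sdfd", "Age": 42}]
    return response

with patch('requests.post', side_effect=fake_post):
    import requests
    result = requests.post('api/tenant/getdomain',
                           json=[{'$match': {}}, {'$limit': 25}])
    print(result.json())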
-
Download file from response header Content-Disposition without a file name - Python
I am trying to scrape information from YouTube. YouTube uses infinite scroll, so after every pull an AJAX call fetches more data. I am using Scrapy on Python. When I request this URL (with a continuation token):
'https://www.youtube.com/results?search_query=tamil&ctoken=xyz&continuation=xyz'
I received status 200 with the following headers:
HTTP/2.0 200 OK
cache-control: no-cache
content-disposition: attachment
expires: Tue, 27 Apr 1971 19:44:06 GMT
content-type: application/json; charset=UTF-8
content-encoding: br
x-frame-options: SAMEORIGIN
strict-transport-security: max-age=31536000
x-spf-response-type: multipart
x-content-type-options: nosniff
date: Mon, 09 Dec 2019 11:59:25 GMT
server: YouTube Frontend Proxy
x-xss-protection: 0
alt-svc: quic=":443"; ma=2592000; v="46,43",h3-Q050=":443"; ma=2592000,h3-Q049=":443"; ma=2592000,h3-Q048=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000
X-Firefox-Spdy: h2
I just need to download the response JSON. I can view the response in the Chrome and Firefox inspectors.
Here is what I tried:
links = "https://www.youtube.com/result?xyxxxx"
ctoken = "xyxxxxxxxx"
ajax_url = "{links}&ctoken={ctoken}&continuation={ctoken}".format(ctoken=ctoken, links=links)
new_data = requests.get(ajax_url).json()
I am getting an error on this.
What I am interested in is: can I download the response as a JSON file for further use, making use of content-disposition: attachment? If I need to download the response, how can I implement that?
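If saving the body to a file is enough (Content-Disposition: attachment arrives without a filename, so I would just pick a name myself), this sketch is what I have in mind (URL and tokens are placeholders):

import requests

ajax_url = "https://www.youtube.com/results?search_query=tamil&ctoken=XYZ&continuation=XYZ"
resp = requests.get(ajax_url, headers={"User-Agent": "Mozilla/5.0"})
resp.raise_for_status()

# Write the raw body to a .json file for later use.
with open("results.json", "w", encoding="utf-8") as fh:
    fh.write(resp.text)

# Or parse it first, if the body really is JSON:
data = resp.json()
print(type(data))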
-
Post Data to Google Analytics through requests
I'm trying to post data into GA, but I'm getting an indexing error. The connection is working, as I'm getting response 200, but there seems to be a problem with the for loop that posts all the rows from my dataframe. Could anyone please help me? Thanks!
endpoint = 'http://www.google-analytics.com/collect'

payload1 = {
    'v': "1",
    't': "event",
    'pa': "purchase",
    'tid': "xxx",
    'cid': df.iloc[i, 0],
    'ti': df.iloc[i, 6],
    'ec': "ecommerce",
    'ea': "transaction",
    'ta': "aaaa",
    'tr': df.iloc[i, 17],
    'cd1': df.iloc[i, 0],
    'cd2': df.iloc[i, 6],
    'cu': "bbb",
    "pr1id": "ccc",
    'pr1nm': "ddd",
    'pr1pr': df.iloc[i, 17],
    'pr1qt': 1,
    'cs': "offline"
}

for i in df.iterrows():
    r = requests.post(url=endpoint, data=payload1, headers={'User-Agent': 'User 1.0'})
    time.sleep(0.1)
    print(r)
Error:
IndexingError                             Traceback (most recent call last)
 in
      4     'pa' : "purchase",
      5     'tid' : "xxx",
----> 6     'cid' : df.iloc[i,0],
      7     'ti' : df.iloc[i,6],
      8     'ec' : "ecommerce",

~\path\lib\site-packages\pandas\core\indexing.py in __getitem__(self, key)
   1416             except (KeyError, IndexError, AttributeError):
   1417                 pass
-> 1418             return self._getitem_tuple(key)
   1419         else:
   1420             # we by definition only have the 0th axis

~\path\lib\site-packages\pandas\core\indexing.py in _getitem_tuple(self, tup)
   2090     def _getitem_tuple(self, tup):
   2091
-> 2092         self._has_valid_tuple(tup)
   2093         try:
   2094             return self._getitem_lowerdim(tup)

~\path\lib\site-packages\pandas\core\indexing.py in _has_valid_tuple(self, key)
    233                 raise IndexingError("Too many indexers")
    234             try:
--> 235                 self._validate_key(k, i)
    236             except ValueError:
    237                 raise ValueError(

~\path\lib\site-packages\pandas\core\indexing.py in _validate_key(self, key, axis)
   2016             # a tuple should already have been caught by this point
   2017             # so don't treat a tuple as a valid indexer
-> 2018             raise IndexingError("Too many indexers")
   2019         elif is_list_like_indexer(key):
   2020             arr = np.array(key)
IndexingError: Too many indexers
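For reference, df.iterrows() yields (index, row) pairs, so building the payload inside the loop from each row would look roughly like this sketch (same df and column positions as above, which I'm assuming are correct):

import time
import requests

endpoint = 'http://www.google-analytics.com/collect'

for idx, row in df.iterrows():  # df is the same dataframe as above
    payload = {
        'v': "1", 't': "event", 'pa': "purchase", 'tid': "xxx",
        'cid': row.iloc[0], 'ti': row.iloc[6],
        'ec': "ecommerce", 'ea': "transaction", 'ta': "aaaa",
        'tr': row.iloc[17], 'cd1': row.iloc[0], 'cd2': row.iloc[6],
        'cu': "bbb", 'pr1id': "ccc", 'pr1nm': "ddd",
        'pr1pr': row.iloc[17], 'pr1qt': 1, 'cs': "offline",
    }
    r = requests.post(url=endpoint, data=payload, headers={'User-Agent': 'User 1.0'})
    time.sleep(0.1)
    print(r)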