
An error with Python's multiprocessing

2021-04-08 10:16

Our Python program reported an error when run on a new dataset:

    dfs = pool.map(partial(pd.read_parquet, **kwargs), file_list)
  File "/usr/local/lib/python3.6/multiprocessing/pool.py", line 266, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/local/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
multiprocessing.pool.MaybeEncodingError: Error sending result: '[                          id  ... email_send_date
[77 rows x 4 columns]]'. Reason: 'error("'i' format requires -2147483648 <= number <= 2147483647",)'

I quickly found this issue on the Python bug tracker: https://bugs.python.org/issue17560. The cause seems to be that Python's multiprocessing machinery encodes the length of a serialized object as a 32-bit signed integer, so a result larger than 2147483647 bytes (about 2 GiB) cannot be sent back from a worker process. The problem reportedly persisted up to Python 3.8.
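The limit in the error message can be reproduced with the struct module alone. The following is a minimal sketch, not the actual multiprocessing code: the helper name fits_in_length_header is made up for illustration, and the "!i" format is assumed to match the 'i' format named in the traceback.

```python
import struct

INT32_MAX = 2**31 - 1  # 2147483647, the upper bound quoted in the error message


def fits_in_length_header(n_bytes: int) -> bool:
    """Return True if n_bytes can be packed as a signed 32-bit integer.

    Hypothetical helper: it only mimics the length check, it does not
    send anything between processes.
    """
    try:
        struct.pack("!i", n_bytes)
        return True
    except struct.error:
        return False


print(fits_in_length_header(INT32_MAX))      # largest length that still fits
print(fits_in_length_header(INT32_MAX + 1))  # one byte more no longer fits
```

Any worker result whose serialized size crosses that bound would hit the same struct.error when its length is packed.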

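The solution is simply to use multithreading instead of multiprocessing. Threads share one address space, so results do not have to be serialized and sent between processes.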

Previous code:

from multiprocessing import Pool

with Pool(processes=n_jobs) as pool:
  pool.map(...)

Solution code:

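from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(max_workers=n_jobs) as pool:
  pool.map(...)  # returns an iterator; wrap in list() to collect the results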
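A fuller version of the thread-based approach might look like the sketch below. The names load_all and fake_reader are assumptions made for illustration; fake_reader stands in for pd.read_parquet so that the example runs without pandas or real Parquet files.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def load_all(loader, file_list, n_jobs=4, **kwargs):
    """Apply loader to every path in file_list using a pool of threads.

    Results stay in the same process, so nothing is pickled and the
    32-bit length limit of multiprocessing does not apply.
    """
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        # Executor.map returns an iterator; list() collects the results
        # in the same order as file_list.
        return list(pool.map(partial(loader, **kwargs), file_list))


def fake_reader(path, suffix=""):
    # Hypothetical stand-in for pd.read_parquet, used only for this demo.
    return path.upper() + suffix


results = load_all(fake_reader, ["a.parquet", "b.parquet"], n_jobs=2, suffix="!")
print(results)
```

With pandas installed, the same helper could presumably be called as load_all(pd.read_parquet, file_list, n_jobs=n_jobs, **kwargs). Threads share the interpreter lock, so this helps mostly when the work is dominated by I/O or by library code that releases the lock.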


