Package Details: python-datasets 3.1.0-1

Git Clone URL: https://aur.archlinux.org/python-datasets.git (read-only)
Package Base: python-datasets
Description: The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Upstream URL: https://github.com/huggingface/datasets
Keywords: datasets deep-learning huggingface machine-learning
Licenses: Apache
Groups: huggingface
Submitter: trougnouf
Maintainer: daskol
Last Packager: daskol
Votes: 3
Popularity: 0.016629
First Submitted: 2021-09-01 14:06 (UTC)
Last Updated: 2024-11-01 16:32 (UTC)

Pinned Comments

daskol commented on 2024-09-11 19:59 (UTC)

WARNING: HuggingFace usually needs some time to stabilize a major or even a minor release. Sometimes this takes up to 5-6 patch releases. So be careful when updating python-datasets to the major v3 release.

Latest Comments

Relih commented on 2023-02-07 21:24 (UTC)

@daskol Sorry, should be fixed now

daskol commented on 2023-02-07 21:23 (UTC)

@Relih Could you attach the output as a log file or post it as a monospace block? As it is, it is extremely hard to read and understand what is going on.

Relih commented on 2023-02-07 21:20 (UTC) (edited on 2023-02-07 21:24 (UTC) by Relih)

I am having problems updating this package. No matter which provider I choose for python-huggingface-hub, pacman reports unmet dependency errors.

The following is what gets reported if I choose python-huggingface-hub:

    ==> Leaving fakeroot environment.
    ==> Finished making: python-huggingface-hub-git 0.13.0.dev0-1 (Di 07 Feb 2023 22:09:01 CET)
    ==> Cleaning up...
     -> Found git repo: github.com/huggingface/huggingface_hub
    loading packages...
    resolving dependencies...
    looking for conflicting packages...
    :: python-huggingface-hub-git and python-huggingface-hub are in conflict. Remove python-huggingface-hub? [y/N] 
    error: unresolvable package conflicts detected
    error: failed to prepare transaction (conflicting dependencies)
    :: python-huggingface-hub-git and python-huggingface-hub are in conflict
    error: target not found: python-huggingface-hub-git
     -> exit status 1

And if I choose python-huggingface-hub-git, I get this after the git package is built:

    ==> Making package: python-datasets 2.9.0-1 (Di 07 Feb 2023 22:17:49 CET)
    ==> Checking runtime dependencies...
    ==> Missing dependencies:
      -> python-huggingface-hub>=0.2.0
      -> python-huggingface-hub<1.0.0
    ==> Checking buildtime dependencies...
    ==> ERROR: Could not resolve all dependencies.
    checking dependencies...
    warning: removing python-multiprocess from target list
    warning: removing python-huggingface-hub-git from target list
     there is nothing to do
     -> error making: python-datasets
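
One possible reading of the output above, offered as an assumption rather than something confirmed in this thread: python-datasets declares versioned dependencies (python-huggingface-hub>=0.2.0 and <1.0.0), and pacman only lets a `provides` entry satisfy a versioned dependency when that entry itself states a version. The sketch below models just that rule. The function names and the example `provides` strings are hypothetical, and the actual contents of the python-huggingface-hub-git PKGBUILD are not shown here.

```python
def split_provision(provision):
    """Split "name=version" into (name, version); version is None if absent."""
    name, sep, version = provision.partition("=")
    return name, (version if sep else None)


def can_satisfy(provision, dep_name, dep_is_versioned):
    """Simplified model: does a provides entry qualify for a dependency?

    Only the "versioned dependency needs a versioned provision" rule is
    modelled; the actual version range comparison is left out on purpose.
    """
    name, version = split_provision(provision)
    if name != dep_name:
        return False
    return version is not None or not dep_is_versioned


# Hypothetical provides entries, for illustration only.
print(can_satisfy("python-huggingface-hub", "python-huggingface-hub", True))
print(can_satisfy("python-huggingface-hub=0.13.0", "python-huggingface-hub", True))
```

Under this model the first call prints False and the second prints True, which would match the "Missing dependencies" message if the git package provides the name without a version.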

Freed commented on 2022-11-21 07:14 (UTC)

Running pip install 'charset-normalizer==2.1.1' works around the bug temporarily.

Freed commented on 2022-11-19 14:23 (UTC)

❯ datasets-cli --help
Traceback (most recent call last):
  File "/usr/lib/python3.10/site-packages/pkg_resources/__init__.py", line 581, in _build_master
    ws.require(__requires__)
  File "/usr/lib/python3.10/site-packages/pkg_resources/__init__.py", line 909, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/usr/lib/python3.10/site-packages/pkg_resources/__init__.py", line 800, in resolve
    raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (charset-normalizer 3.0.0 (/usr/lib/python3.10/site-packages), Requirement.parse('charset-normalizer<3.0,>=2.0'), {'aiohttp'})

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/bin/datasets-cli", line 33, in <module>
    sys.exit(load_entry_point('datasets==1.17.0', 'console_scripts', 'datasets-cli')())
  File "/usr/bin/datasets-cli", line 25, in importlib_load_entry_point
    return next(matches).load()
  File "/usr/lib/python3.10/importlib/metadata/__init__.py", line 171, in load
    module = import_module(match.group('module'))
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 992, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 992, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/usr/lib/python3.10/site-packages/datasets/__init__.py", line 34, in <module>
    from .arrow_dataset import Dataset, concatenate_datasets
  File "/usr/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 61, in <module>
    from .arrow_writer import ArrowWriter, OptimizedTypedSequence
  File "/usr/lib/python3.10/site-packages/datasets/arrow_writer.py", line 28, in <module>
    from .features import (
  File "/usr/lib/python3.10/site-packages/datasets/features/__init__.py", line 2, in <module>
    from .audio import Audio
  File "/usr/lib/python3.10/site-packages/datasets/features/audio.py", line 7, in <module>
    from ..utils.streaming_download_manager import xopen
  File "/usr/lib/python3.10/site-packages/datasets/utils/streaming_download_manager.py", line 15, in <module>
    from aiohttp.client_exceptions import ClientError
  File "/usr/lib/python3.10/site-packages/aiohttp/__init__.py", line 212, in <module>
    from .worker import GunicornUVLoopWebWorker, GunicornWebWorker
  File "/usr/lib/python3.10/site-packages/aiohttp/worker.py", line 11, in <module>
    from gunicorn.config import AccessLogFormat as GunicornAccessLogFormat
  File "/usr/lib/python3.10/site-packages/gunicorn/config.py", line 20, in <module>
    from gunicorn import __version__, util
  File "/usr/lib/python3.10/site-packages/gunicorn/util.py", line 25, in <module>
    import pkg_resources
  File "/usr/lib/python3.10/site-packages/pkg_resources/__init__.py", line 3260, in <module>
    def _initialize_master_working_set():
  File "/usr/lib/python3.10/site-packages/pkg_resources/__init__.py", line 3234, in _call_aside
    f(*args, **kwargs)
  File "/usr/lib/python3.10/site-packages/pkg_resources/__init__.py", line 3272, in _initialize_master_working_set
    working_set = WorkingSet._build_master()
  File "/usr/lib/python3.10/site-packages/pkg_resources/__init__.py", line 583, in _build_master
    return cls._build_from_requirements(__requires__)
  File "/usr/lib/python3.10/site-packages/pkg_resources/__init__.py", line 596, in _build_from_requirements
    dists = ws.resolve(reqs, Environment())
  File "/usr/lib/python3.10/site-packages/pkg_resources/__init__.py", line 795, in resolve
    raise DistributionNotFound(req, requirers)
pkg_resources.DistributionNotFound: The 'charset-normalizer<3.0,>=2.0' distribution was not found and is required by aiohttp
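
The traceback above comes down to one check: aiohttp requires charset-normalizer<3.0,>=2.0, while version 3.0.0 is installed. A minimal sketch of that check follows, using only the standard library. The helper names are hypothetical, and the version comparison is a simplification that looks only at leading numeric components, not the full ordering rules pkg_resources applies.

```python
from importlib import metadata


def version_key(version, width=4):
    """Turn "3.0.0" into (3, 0, 0, 0) so versions compare as tuples.

    Only leading numeric components are used (simplified ordering).
    """
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    parts = parts[:width]
    return tuple(parts + [0] * (width - len(parts)))


def within(version, lower, upper):
    """True if lower <= version < upper."""
    return version_key(lower) <= version_key(version) < version_key(upper)


def installed_within(dist_name, lower, upper):
    """Return (installed_version, ok); (None, False) if not installed."""
    try:
        installed = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None, False
    return installed, within(installed, lower, upper)


print(within("3.0.0", "2.0", "3.0"))  # version reported in the traceback
print(within("2.1.1", "2.0", "3.0"))  # version pinned by the workaround
print(installed_within("charset-normalizer", "2.0", "3.0"))
```

The first two calls print False and True; the last one depends on what is installed on the machine where it runs.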

ModYokosuka commented on 2022-09-22 19:04 (UTC)

Could you disown this so I can adopt it?

trougnouf commented on 2021-12-23 21:32 (UTC)

Updated, thanks for the notification. I no longer use this package, so if anyone wants to take it over, please let me know and I will disown it.

Ayaka commented on 2021-11-10 07:43 (UTC)

>>> from datasets import load_dataset
ModuleNotFoundError: No module named 'fsspec'
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
/tmp/ipykernel_896541/660973178.py in <module>
----> 1 from datasets import load_dataset

/usr/lib/python3.9/site-packages/datasets/__init__.py in <module>
     31     )
     32 
---> 33 from .arrow_dataset import Dataset, concatenate_datasets
     34 from .arrow_reader import ArrowReader, ReadInstruction
     35 from .arrow_writer import ArrowWriter

/usr/lib/python3.9/site-packages/datasets/arrow_dataset.py in <module>
     32 from typing import TYPE_CHECKING, Any, BinaryIO, Callable, Dict, Iterator, List, Optional, Tuple, Union
     33 
---> 34 import fsspec
     35 import numpy as np
     36 import pandas as pd

ModuleNotFoundError: No module named 'fsspec'
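
The traceback above shows the import failing because fsspec is not installed. A quick way to see which top-level modules are missing before importing datasets is sketched below. The helper name is hypothetical, and the module list is only what this traceback mentions, not the package's full dependency list.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


# Modules imported at the top of arrow_dataset.py in the traceback above.
print(missing_modules(["fsspec", "numpy", "pandas"]))
```

An empty list means all three can be found; any name printed is a candidate for a missing runtime dependency.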