aac_datasets.datasets.wavcaps module

class WavCaps(
    root: str | Path | None = None,
    subset: 'audioset' | 'bbc' | 'freesound' | 'soundbible' | 'audioset_no_audiocaps_v1' | 'freesound_no_clotho_v2' = 'audioset_no_audiocaps_v1',
    download: bool = False,
    transform: Callable[[WavCapsItem], Any] | None = None,
    verbose: int = 0,
    force_download: bool = False,
    verify_files: bool = False,
    *,
    clean_archives: bool = False,
    hf_cache_dir: str | None = None,
    repo_id: str | None = None,
    revision: str | None = '85a0c21e26fa7696a5a74ce54fada99a9b43c6de',
    zip_path: str | Path | None = None,
)

Bases: AACDataset[WavCapsItem]

Unofficial WavCaps PyTorch dataset.
WavCaps paper: https://arxiv.org/pdf/2303.17395.pdf
HuggingFace source: https://huggingface.co/datasets/cvssp/WavCaps
This dataset contains 4 training subsets, extracted from different sources:
- BBC Sound Effects: "bbc"
- SoundBible: "soundbible"
- AudioSet strongly labeled, without the AudioCaps V1 val and test subsets: "audioset_no_audiocaps_v1"
- FreeSound, without the Clotho dev, val, eval and test subsets: "freesound_no_clotho_v2"

Other subsets exist, but they do not comply with the DCASE Challenge rules:
- AudioSet strongly labeled: "audioset"
- FreeSound: "freesound"
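The split between compliant and non-compliant subsets can be made explicit before the dataset is built. The following is a minimal sketch, not part of `aac_datasets`: the helper `make_wavcaps_kwargs` and its `allow_non_compliant` flag are hypothetical names introduced for illustration, while the subset names and the `root`, `subset` and `download` parameters come from the signature above.

```python
# Subset names copied from the WavCaps class signature.
DCASE_COMPLIANT_SUBSETS = (
    "bbc",
    "soundbible",
    "audioset_no_audiocaps_v1",
    "freesound_no_clotho_v2",
)
NON_COMPLIANT_SUBSETS = ("audioset", "freesound")


def make_wavcaps_kwargs(root, subset="audioset_no_audiocaps_v1", allow_non_compliant=False):
    """Hypothetical helper: validate `subset` and build keyword arguments for WavCaps."""
    if subset in NON_COMPLIANT_SUBSETS:
        # These subsets exist but do not follow the DCASE Challenge rules.
        if not allow_non_compliant:
            raise ValueError(f"Subset {subset!r} does not comply with the DCASE Challenge rules.")
    elif subset not in DCASE_COMPLIANT_SUBSETS:
        raise ValueError(f"Unknown WavCaps subset: {subset!r}")
    # download=False matches the documented default; nothing is fetched here.
    return {"root": root, "subset": subset, "download": False}
```

With `aac_datasets` installed and the data already on disk, the returned dictionary could presumably be passed on as `WavCaps(**make_wavcaps_kwargs("data", "bbc"))`.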
Warning
WavCaps download is experimental: it requires a lot of disk space and can take a very long time to download and extract, so errors may occur.
Dataset folder tree

{root}
└── WavCaps
    ├── Audio
    │   ├── AudioSet_SL
    │   │   └── (108317 flac files, ~64GB)
    │   ├── BBC_Sound_Effects
    │   │   └── (31201 flac files, ~142GB)
    │   ├── FreeSound
    │   │   └── (262300 flac files, ~1.4TB)
    │   └── SoundBible
    │       └── (1232 flac files, ~884MB)
    ├── Zip_files
    │   ├── AudioSet_SL
    │   │   └── (8 zip files, ~76GB)
    │   ├── BBC_Sound_Effects
    │   │   └── (26 zip files, ~562GB)
    │   ├── FreeSound
    │   │   └── (123 zip? files, ~1.4TB)
    │   └── SoundBible
    │       └── (1 zip? files, ~624GB)
    ├── json_files
    │   ├── AudioSet_SL
    │   │   └── as_final.json
    │   ├── BBC_Sound_Effects
    │   │   └── bbc_final.json
    │   ├── FreeSound
    │   │   ├── fsd_final_2s.json
    │   │   └── fsd_final.json
    │   ├── SoundBible
    │   │   └── sb_final.json
    │   └── blacklist
    │       ├── blacklist_exclude_all_ac.json
    │       ├── blacklist_exclude_test_ac.json
    │       └── blacklist_exclude_ubs8k_esc50_vggsound.json
    ├── .gitattributes
    └── README.md
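Given how large the download is, a quick check of the metadata layout can be useful before the dataset is instantiated. The sketch below uses only `pathlib` and the `json_files` paths listed in the tree above; the helper `missing_wavcaps_json` is a hypothetical name, and it is independent of the library's own `verify_files` option, whose behavior is not described here.

```python
from pathlib import Path

# Relative paths taken from the folder tree (main metadata files only).
EXPECTED_JSON_FILES = (
    "json_files/AudioSet_SL/as_final.json",
    "json_files/BBC_Sound_Effects/bbc_final.json",
    "json_files/FreeSound/fsd_final_2s.json",
    "json_files/FreeSound/fsd_final.json",
    "json_files/SoundBible/sb_final.json",
)


def missing_wavcaps_json(root):
    """Hypothetical helper: list expected metadata files absent under {root}/WavCaps."""
    base = Path(root) / "WavCaps"
    return [rel for rel in EXPECTED_JSON_FILES if not (base / rel).is_file()]
```

An empty list means all five metadata files are present; it says nothing about the audio or zip folders.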
- CARD: ClassVar[WavCapsCard] = <aac_datasets.datasets.functional.wavcaps.WavCapsCard object>
- property subset: 'audioset' | 'bbc' | 'freesound' | 'soundbible' | 'audioset_no_audiocaps_v1' | 'freesound_no_clotho_v2'