AnimeTimm

AnimeTimm is a DeepGHS project for training, testing, and sharing timm-based vision models for anime-style and illustration-focused image tagging.

It is part research playground, part anime-fan workshop: we care about reproducible datasets, model cards, ONNX exports, and practical demos, but the models are also built for people who actually work with 2D art, tags, characters, and visual search.

Project Stewardship

AnimeTimm is produced and maintained by the DeepGHS team and contributors.

This Hugging Face organization is the focused publishing home for AnimeTimm releases: model checkpoints, selected training datasets, and interactive Spaces. Upstream engineering work happens in the DeepGHS GitHub organization, including the deepghs/animetimm repository.

What We Build

timm-based image tagging and classification models for anime-style images.
Training datasets prepared for large-scale tagger experiments.
PyTorch, Safetensors, and ONNX artifacts where available.
Playgrounds and ranklists for trying models and comparing outputs.

Featured Dataset

`danbooru-wdtagger-v4-w640-ws-full`

The main public dataset release used by the dbv4-full model family. It is a Danbooru-derived WebDataset build for large-scale anime-style multi-label tagging, with images resized so that the shorter side is at most 640 pixels, i.e. min(width, height) <= 640.
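As a rough illustration of the size constraint, the sketch below computes the dimensions an image would have after capping its shorter side at 640. The function name `capped_size` is hypothetical, and the details are assumptions rather than the dataset's documented procedure: aspect ratio is preserved, smaller images are not upscaled, and sizes are rounded to the nearest pixel.

```python
from typing import Tuple


def capped_size(width: int, height: int, limit: int = 640) -> Tuple[int, int]:
    """Return (width, height) after capping the shorter side at `limit`.

    Assumptions (not confirmed by the dataset card): aspect ratio is
    preserved, images are never upscaled, and sizes are rounded to the
    nearest pixel.
    """
    short_side = min(width, height)
    if short_side <= limit:
        # Already within the cap, so the size is left unchanged.
        return width, height
    scale = limit / short_side
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    print(capped_size(1280, 1920))  # shorter side 1280 is scaled down to 640
    print(capped_size(500, 3000))   # shorter side 500 is already under the cap
```

Note that only the shorter side is bounded under this reading; a very wide or very tall image can still have a long side well above 640.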

Split	Images	Total Size
train	5,321,713	318 GB
test	295,926	17.7 GB
val	296,957	17.8 GB
total	5,914,596	353.5 GB

Each sample contains the image as webp plus JSON metadata: id, width, height, rating, general_tags, and character_tags. The selected label space has 12,476 tags: 9,225 general tags, 3,247 character tags, and 4 rating tags.
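A minimal sketch of reading such samples from a single shard, using only the Python standard library. It assumes the usual WebDataset convention that a shard is a tar archive in which files sharing a basename (for example `123.webp` and `123.json`) form one sample; the shard file names and the exact value types of the metadata fields are not stated here, so check the dataset card. The helper name `iter_samples` is hypothetical.

```python
import json
import posixpath
import tarfile
from typing import BinaryIO, Dict, Iterator, Tuple


def iter_samples(shard: BinaryIO) -> Iterator[Tuple[str, bytes, Dict]]:
    """Yield (key, webp_bytes, metadata) for each sample in one tar shard.

    Assumes the WebDataset layout: the part of the basename before the
    first dot is the sample key, the rest is the file extension.
    This sketch loads the whole shard into memory, which is acceptable
    for small tests but not for large production shards.
    """
    grouped: Dict[str, Dict[str, bytes]] = {}
    with tarfile.open(fileobj=shard, mode="r") as tar:
        for member in tar:
            if not member.isfile():
                continue
            key, _, ext = posixpath.basename(member.name).partition(".")
            grouped.setdefault(key, {})[ext] = tar.extractfile(member).read()

    for key, files in grouped.items():
        # Skip incomplete samples that lack either the image or the metadata.
        if "webp" in files and "json" in files:
            yield key, files["webp"], json.loads(files["json"])
```

Each yielded metadata dictionary would then be expected to carry the fields listed above (id, width, height, rating, general_tags, character_tags). For real training runs, a streaming reader such as the `webdataset` library is likely a better fit than this in-memory version.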

Model Zoo: `dbv4-full`

The tables below focus only on the main dbv4-full model line. Metrics are copied from the corresponding model cards and use the test split reported there.

Figure: dbv4-full model performance and parameter snapshot.

Top 5 By Macro F1

Rank	Model	Family	Params	Macro@Best F1	Macro@0.40 F1	Micro@0.40 F1
1	convnextv2_huge.dbv4-full	ConvNeXt	692.6M	0.611	0.580	0.697
2	eva02_large_patch14_448.dbv4-full	EVA	316.8M	0.599	0.569	0.693
3	caformer_b36.dbv4-full	CAFormer	134.0M	0.581	0.546	0.689
4	swinv2_base_window8_256.dbv4-full	SwinV2	99.7M	0.575	0.541	0.683
5	caformer_m36.dbv4-full	CAFormer	82.7M	0.559	0.515	0.676

Representative Models By Backbone Family

Each row is the best dbv4-full model currently published for that backbone family.

Family	Model	Params	Macro@Best F1	Macro@0.40 F1	Micro@0.40 F1
ConvNeXt	convnextv2_huge.dbv4-full	692.6M	0.611	0.580	0.697
EVA	eva02_large_patch14_448.dbv4-full	316.8M	0.599	0.569	0.693
CAFormer	caformer_b36.dbv4-full	134.0M	0.581	0.546	0.689
SwinV2	swinv2_base_window8_256.dbv4-full	99.7M	0.575	0.541	0.683
ViT	vit_base_patch16_224.dbv4-full	95.8M	0.540	0.500	0.664
MobileNetV4	mobilenetv4_conv_aa_large.dbv4-full	47.3M	0.511	0.458	0.641
ResNet	resnet152.dbv4-full	83.7M	0.486	0.448	0.624
MobileNetV3	mobilenetv3_large_150d.dbv4-full	29.3M	0.462	0.400	0.605
MobileViT	mobilevitv2_200.dbv4-full	30.2M	0.454	0.401	0.608
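To make the Macro and Micro F1 columns in the tables above easier to read, here is a small pure-Python sketch of both averages at a fixed threshold. It is an illustration of the standard definitions, not the evaluation code behind the model cards. The function name `f1_at_threshold`, the use of `>=` for thresholding, and scoring an empty tag as 0.0 are all assumptions. How the "Best" threshold for Macro@Best F1 is selected is not stated here; see the individual model cards.

```python
from typing import Sequence, Tuple


def f1_at_threshold(
    probs: Sequence[Sequence[float]],
    labels: Sequence[Sequence[int]],
    threshold: float = 0.40,
) -> Tuple[float, float]:
    """Return (macro_f1, micro_f1) for multi-label predictions.

    probs[i][j]  is the predicted probability of tag j for image i.
    labels[i][j] is 1 if tag j is present on image i, else 0.
    """
    n_tags = len(labels[0])
    tp = [0] * n_tags
    fp = [0] * n_tags
    fn = [0] * n_tags
    for p_row, y_row in zip(probs, labels):
        for j, (p, y) in enumerate(zip(p_row, y_row)):
            predicted = p >= threshold
            if predicted and y:
                tp[j] += 1
            elif predicted:
                fp[j] += 1
            elif y:
                fn[j] += 1

    def f1(t: int, p: int, n: int) -> float:
        denom = 2 * t + p + n
        # Assumed convention: a tag with no positives and no predictions scores 0.
        return 2 * t / denom if denom else 0.0

    # Macro: average the per-tag F1 scores, so rare tags weigh as much as common ones.
    macro = sum(f1(tp[j], fp[j], fn[j]) for j in range(n_tags)) / n_tags
    # Micro: pool the counts over all tags first, so frequent tags dominate.
    micro = f1(sum(tp), sum(fp), sum(fn))
    return macro, micro
```

Because macro averaging weights every tag equally, a label space with many rare tags tends to produce a macro score below the micro score, which matches the pattern in the tables.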

Try It

dbv4-full-playground - tag images with pretrained dbv4-full models.
dbv4-full-ranklist - compare the public dbv4-full model lineup.

Maintenance

The source data and chart builder are stored in this Space repository so the organization card can be regenerated without guessing:

data/dbv4_full_models.csv - checked-in metric table.
data/dbv4_full_dataset_summary.json - checked-in featured dataset summary.
data/featured_models.json - top-5 and best-by-family selections.
scripts/build_org_card.py - regenerates the banner and model snapshot chart from the checked-in data.
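The contents of scripts/build_org_card.py are not reproduced here. As a hedged sketch of how top-5 and best-by-family selections could be derived from a metric table, the example below ranks rows by Macro@Best F1. The column names (`model`, `family`, `macro_best_f1`), the function names, and the choice of ranking metric are assumptions; the checked-in CSV and script may be organized differently. The sample rows reuse values from the tables above.

```python
import csv
import io
from typing import Dict, List

# Hypothetical column names; the real data/dbv4_full_models.csv may differ.
SAMPLE_CSV = """model,family,macro_best_f1
convnextv2_huge.dbv4-full,ConvNeXt,0.611
eva02_large_patch14_448.dbv4-full,EVA,0.599
caformer_b36.dbv4-full,CAFormer,0.581
swinv2_base_window8_256.dbv4-full,SwinV2,0.575
caformer_m36.dbv4-full,CAFormer,0.559
vit_base_patch16_224.dbv4-full,ViT,0.540
"""


def load_rows(text: str) -> List[Dict]:
    """Parse CSV text and convert the metric column to float."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["macro_best_f1"] = float(row["macro_best_f1"])
    return rows


def top_n(rows: List[Dict], n: int = 5) -> List[Dict]:
    """Return the n rows with the highest Macro@Best F1."""
    return sorted(rows, key=lambda r: r["macro_best_f1"], reverse=True)[:n]


def best_by_family(rows: List[Dict]) -> Dict[str, Dict]:
    """Return the highest-scoring row for each backbone family."""
    best: Dict[str, Dict] = {}
    for row in rows:
        current = best.get(row["family"])
        if current is None or row["macro_best_f1"] > current["macro_best_f1"]:
            best[row["family"]] = row
    return best


if __name__ == "__main__":
    rows = load_rows(SAMPLE_CSV)
    for rank, row in enumerate(top_n(rows), start=1):
        print(rank, row["model"], row["macro_best_f1"])
```

Keeping the selection logic a pure function of the checked-in table means the featured lists can be recomputed and compared against data/featured_models.json whenever new models are added.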

Acknowledgements

@narugo1992 (GitHub) built the deepghs/animetimm GitHub project, including the end-to-end data, training, and release pipeline, and carried out all model training for AnimeTimm.
@SmilingWolf (GitHub) is gratefully acknowledged for the AnimeTagger idea and for providing mature Danbooru metadata cleaning techniques that made this line of work much more practical.
@7eu7d7 (RainbowNeko / IrisRainbowNeko on GitHub) is gratefully acknowledged for directional guidance during training and tuning.
Thanks to all other DeepGHS contributors who helped shape the surrounding infrastructure, experiments, reviews, and release process.

Notes

These releases are research and hobbyist infrastructure for visual tagging. Please check each model or dataset card for license, source data notes, intended use, and audience restrictions before reuse.