About
Hanif Rahman
Independent researcher. Computational biology, Pashto NLP, and patient education.
I work across Pashto language technology, kidney disease research, and public education. The common thread is infrastructure: datasets, benchmarks, explanations, and tools that let other people reason from clearer evidence.
My recent language work focuses on Pashto, a language with tens of millions of speakers but little public NLP and speech infrastructure. The current research programme covers corpus construction, Common Voice growth, multilingual speech recognition, script fidelity, and reproducible evaluation.
The kidney disease work is written for a different reader: patients, families, and clinicians who need accurate explanations without having to work through a pile of journal articles. My IgA nephropathy guide and articles turn clinical and biological ideas into practical questions people can bring to appointments.
If you are working on Pashto data, speech recognition, kidney disease communication, or computational biology, I would be glad to hear from you.
Recent papers
Benchmarking Multilingual Speech Models on Pashto: Zero-Shot ASR, Script Failure, and Cross-Domain Evaluation
arXiv 2026, arXiv:2604.04598
Script collapse in multilingual ASR: A reference-free metric and 100-pair benchmark
arXiv 2026, arXiv:2604.08786
Fine-tuning Whisper for Pashto ASR: strategies and scale
arXiv 2026, arXiv:2604.06507