Biology
A Beginner's Guide to Using DeepVirFinder for Viral Sequence Identification From Metagenomic Datasets
Document Type
Article
Abstract
Identifying viral sequences in metagenomic datasets is critical for investigating their origins, evolutionary patterns, and ecological functions. We previously developed DeepVirFinder, a deep learning tool that predicts viral sequences from shotgun metagenomic assemblies. The method employs a twin convolutional neural network model to extract features from known viral and prokaryotic host genomic sequences for binary classification of input query sequences. With the rapid accumulation of environmental metagenomic data, this alignment-free and reference-free deep learning strategy has accelerated the discovery of novel viruses from diverse environments. To ease adoption by beginning users, we have further improved DeepVirFinder by optimizing its runtime performance while maintaining the essential user interface of the original version. This guide provides basic workflows for the most common use cases of DeepVirFinder. To assist with downstream analyses, the software also includes supplementary scripts for extracting viral sequences and inspecting the results, helping researchers mine viral information from metagenomic datasets more effectively. © 2026 Wiley Periodicals LLC. Basic Protocol 1: Predicting viral sequences in metagenomic assemblies. Basic Protocol 2: An integrated pipeline for viral sequence analysis: Prediction, extraction, and visualization. Basic Protocol 3: Retraining the DeepVirFinder model using a customized dataset.
Publication Title
Current Protocols
Publication Date
2-2026
Volume
6
Issue
2
ISSN
2691-1299
DOI
10.1002/cpz1.70310
Keywords
deep learning, DeepVirFinder, metagenomics, viromics, viruses
Repository Citation
Mo, Yuqian; Ahlgren, Nathan; Fuhrman, Jed A.; Sun, Fengzhu; and Hou, Shengwei, "A Beginner's Guide to Using DeepVirFinder for Viral Sequence Identification From Metagenomic Datasets" (2026). Biology. 446.
https://commons.clarku.edu/faculty_biology/446
