Biology

A Beginner's Guide to Using DeepVirFinder for Viral Sequence Identification From Metagenomic Datasets

Document Type

Article

Abstract

Identifying viral sequences in metagenomic datasets is critical for investigating their origins, evolutionary patterns, and ecological functions. We previously developed DeepVirFinder, a deep learning tool that predicts viral sequences from shotgun metagenomic assemblies. The method employs a twin convolutional neural network model to extract features from known viral and prokaryotic host genomic sequences and uses them for binary classification of input query sequences. As environmental metagenomic data accumulate rapidly, this alignment-free and reference-free deep learning strategy has accelerated the discovery of novel viruses from diverse environments. To help beginning users adopt the software quickly, we have further improved DeepVirFinder by optimizing its runtime performance while maintaining the essential user interface of the original version. This guide provides basic workflows for the most common use cases of DeepVirFinder. To assist with downstream analyses, supplementary scripts are provided with the software for extracting viral sequences and inspecting the results, helping researchers mine viral information from metagenomic datasets more effectively. © 2026 Wiley Periodicals LLC.

Basic Protocol 1: Predicting viral sequences in metagenomic assemblies.

Basic Protocol 2: An integrated pipeline for viral sequence analysis: Prediction, extraction, and visualization.

Basic Protocol 3: Retraining the DeepVirFinder model using a customized dataset.
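The abstract describes an alignment-free classifier that learns features directly from nucleotide sequences. As a rough illustration of how a DNA sequence can be turned into numeric input for such a model, the sketch below one-hot encodes a sequence and its reverse complement. This is an assumption made for illustration only: the abstract does not state the encoding DeepVirFinder uses, and the function names here (`one_hot`, `reverse_complement`, `encode_both_strands`) are hypothetical and not part of the software.

```python
# Illustrative sketch only; not DeepVirFinder code.
# Assumption: sequences are one-hot encoded over A, C, G, T, and both
# strands are encoded so a model could score each of them.

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def one_hot(seq):
    """Return one row [A, C, G, T] per base; ambiguous bases become all zeros."""
    index = {base: i for i, base in enumerate(BASES)}
    rows = []
    for base in seq.upper():
        row = [0, 0, 0, 0]
        if base in index:
            row[index[base]] = 1
        rows.append(row)
    return rows


def reverse_complement(seq):
    """Complement each base, then reverse the string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def encode_both_strands(seq):
    """Encode the forward strand and its reverse complement."""
    return one_hot(seq), one_hot(reverse_complement(seq))


if __name__ == "__main__":
    forward, reverse = encode_both_strands("ACGTN")
    print(forward)
    print(reverse)
```

Encoding both strands reflects the fact that a contig in an assembly may come from either strand of the original genome; how the two encodings are combined inside a network is a design choice not covered here.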

Publication Title

Current Protocols

Publication Date

2-2026

Volume

6

Issue

2

ISSN

2691-1299

DOI

10.1002/cpz1.70310

Keywords

deep learning, DeepVirFinder, metagenomics, viromics, viruses
