VirusTotal is a False Positive Factory

You should not take VirusTotal results seriously. The platform lacks any basic quality control.

My Software

I have spent over a thousand hours developing XL Converter.

It allows people to use the latest image technology like lossless JPEG transcoding. It manages binaries, converts, scales, proxies, applies metadata, compares, and reconstructs. Completely free and open-source.

It came as a shock to me that VirusTotal marks it as a “trojan” with 7 detections!

link The 7th one disappeared as I was writing this article (screenshot)

I know they are false positives, because I wrote this program. Since the code is public, you can both verify its safety and reproduce false positives.

VirusTotal is full of no-name solutions with no credibility and high false positive rates.

VirusTotal also scans network traffic and lists infected IPs. The problem is that it detects malware on IPs used by Microsoft. Microsoft dependencies come bundled with those IPs, and that’s how they end up in EXEs. This affects most (if not all) software on Windows.

Under these IPs, thousands of incompetent “security researchers” make false judgments, with one sane individual writing:

This IP is used by Microsoft, or Microsoft application tracking. Because if you build a clean EXE in Visual Studio 2022 […] you’ll end up with a main application binary that contacts this IP address […]
source

It’s like trying to ban water because criminals drink water.

Further down the line, I spotted this.

Defense Evasion – Reference anti-VM strings targeting Qemu

What? By this point, VirusTotal exists purely to feed into people’s confirmation bias. Also, their URL scanner detects Cloudflare as a phishing site.

How did nobody notice how bad VirusTotal is with false positives? Why is nobody talking about it?

The screenshots may differ from what the VirusTotal URLs show now, because files are constantly rescanned.

False Positive Factory

They track false positives, but they don’t do anything about them.

Below are the solutions that I personally verified to have high false positive rates.

Antiy-AVL
MaxSecure
Jiangmin
Qihoo 360
Google
Bkav Pro
SecureAge
SecureAge APEX
Cynet
Sangfor Engine Zero
Zillya
Varist
Ikarus
Webroot
Trapmine
Gridinsoft (no cloud)
Microsoft
Elastic
ArcSight Threat Intelligence
MalwareURL
CyRadar

Every time I scan legitimate software, new solutions pop up as if VirusTotal had an unlimited supply of shovelware. Have a look through all the solutions they use; most look like scams.

Chocolatey does not treat a file as malware until it reaches 5 positives on VirusTotal, and for a reason.

Because some scanners can be quite aggressive and may falsely identify a binary as a false positive for malware. […] This means 5 anti-virus scanners need to flag the binary for Chocolatey CLI to stop and fail the install or upgrade
Source

My software got flagged by 7…
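The threshold rule from the Chocolatey quote can be sketched in a few lines. This is only an illustration of the idea, not Chocolatey's actual code: the function name `should_block` and the constant `MIN_POSITIVES` are made up for this example, and the value 5 is taken from the quote above.

```python
# Hypothetical sketch of a "minimum positives" rule.
# MIN_POSITIVES = 5 comes from the Chocolatey quote; the names are assumptions.
MIN_POSITIVES = 5


def should_block(positives: int, min_positives: int = MIN_POSITIVES) -> bool:
    """Return True when enough scanners flag a file to stop the install."""
    return positives >= min_positives


if __name__ == "__main__":
    # Show the verdict for a few detection counts, including my 7.
    for count in (0, 4, 5, 7):
        print(count, "blocked" if should_block(count) else "allowed")
```

Under this rule a handful of noisy scanners is ignored, but 7 detections still crosses the line, even when every one of them is a false positive.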

Copium

Their solution to this was to establish a whitelist. However, it’s a rich people-only club. They told regular developers to go and eat dirt.

VirusTotal is not responsible for false positives generated by any of the resources it uses
source

VirusTotal has become a standard for millions of people. Its huge problem with false positives affects developers. I still remember my files getting booted off Archive.org because VirusTotal detected a virus.

The developers are made to explain themselves, which is humiliating, to say the least.

The quality of this service is abhorrent.

(Not) Detecting Malware

Update: 02-09-2024

While discussing a problem on GitHub Issues, one user was tricked by a Russian hacker into running malware. I hid the content, blocked the profiles, and reported them to GitHub.

To clarify, this specific problem has nothing to do with my program. It’s related to a discussion on GitHub Issues.

I also uploaded the malicious sample to VirusTotal.

While 2/3 of AV solutions identified the threat, the other 1/3 did not. Funnily enough, you can spot the same AV solutions I mentioned as having high false positive rates.

This leads me to my point…

AI/ML

Solutions that misidentify files are often ML-based (“AI”).

The problem with VirusTotal is that it does not label them. They are listed right next to credible solutions, which misleads people.

I propose adding labels for AV solutions with “high false positives” and “mostly ML-based”. Otherwise, people just go around bothering devs with this nonsense…
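To make the proposal concrete, here is a minimal sketch of how labeled results could be summarized. Everything in it is hypothetical: the engine names are placeholders rather than real products, the label assignments are invented for illustration, and `summarize` is not part of any VirusTotal API.

```python
from collections import Counter

# Hypothetical label table. Engine names are placeholders, not real products.
ENGINE_LABELS = {
    "EngineA": "high false positives",
    "EngineB": "mostly ML-based",
}


def summarize(detections: list[str],
              labels: dict[str, str] = ENGINE_LABELS) -> dict[str, int]:
    """Count detections per label; engines without a label are 'unlabeled'."""
    counts = Counter(labels.get(engine, "unlabeled") for engine in detections)
    return dict(counts)


if __name__ == "__main__":
    # Three engines flagged the file; only one of them carries no warning label.
    print(summarize(["EngineA", "EngineB", "EngineC"]))
```

A result page built on something like this could show "1 unlabeled, 1 high false positives, 1 mostly ML-based" instead of a bare "3 detections", which tells the reader far more.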

Societal Impact

People take VirusTotal’s judgment as indisputable fact. Society has not widely accepted the existence of false positives. Most people only seem to acknowledge them when they’re affected personally. src1 src2 src3 src4

VirusTotal needs quality control. If you offer solutions with high false positive rates, inform users what they are.