A joint investigation by SentinelLABS and Censys says open-source model deployment is creating a large, publicly reachable layer of AI compute that sits outside the guardrails common on major AI platforms.
The researchers recorded 7.23 million observations from 175,108 unique Ollama hosts across 130 countries over a 293-day period, then analyzed how those endpoints behave and what they advertise.
The firms say the risk is not tied to a newly discovered software vulnerability. It is driven by exposure choices that repeat across thousands of operators.
Ollama runs locally by default and binds to 127.0.0.1:11434, meaning it is accessible only from the same machine unless reconfigured.
Its documentation explains that changing the bind address via OLLAMA_HOST can expose the service on a network. At scale, the researchers argue, these configuration decisions add up to an internet-facing surface that can be reused for unintended workloads.
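That exposure is observable with nothing more than an unauthenticated HTTP request. The sketch below, which assumes Python's standard library and a placeholder target address, probes the `/api/tags` endpoint that Ollama uses to list installed models; only probe systems you are authorized to test.

```python
import json
import urllib.request

def list_models(host: str, port: int = 11434, timeout: float = 3.0):
    """Return the model list an Ollama host advertises, or None if unreachable.

    An instance left on the default 127.0.0.1 bind answers only locally;
    one reconfigured via OLLAMA_HOST=0.0.0.0 answers this same request
    from anywhere, with no credentials required.
    """
    try:
        with urllib.request.urlopen(
            f"http://{host}:{port}/api/tags", timeout=timeout
        ) as resp:
            return json.load(resp).get("models", [])
    except OSError:  # connection refused, timeout, unreachable, etc.
        return None
```

A non-exposed or unreachable host simply yields `None`; an exposed one returns its full model inventory, which is the kind of advertisement the researchers' scans recorded.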
Persistence makes abuse more predictable
The researchers reported a “core” population of always-on nodes that generated most observations, suggesting the ecosystem extends beyond short-lived hobby servers to include stable endpoints that can be reused repeatedly.
This matters because reliability changes attacker economics. Opportunistic abuse is sporadic. A persistent backbone enables repeated use, iteration and operational planning, especially when endpoints are reachable without friction.
Tool calling shifts the threat model from “bad text” to “actions”
The researchers reported that a large share of observed hosts advertised tool-calling capabilities, meaning the model could invoke external APIs or system functions rather than only generate text.
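The capability is advertised through the chat API itself. A minimal sketch of the request body involved, assuming Ollama's documented `/api/chat` tool-calling format; the model name and the `get_weather` tool are illustrative, not drawn from the study:

```python
import json

# Hypothetical payload a client would POST to an exposed host's /api/chat
# endpoint. The "tools" array declares functions the model may ask to invoke,
# turning a text generator into a component that can trigger actions.
payload = {
    "model": "llama3.1",
    "messages": [
        {"role": "user", "content": "What is the weather in Paris?"}
    ],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",  # illustrative tool name
            "description": "Fetch current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
}
body = json.dumps(payload)
```

When a host honors such a request, the security question stops being what the model might say and becomes what the declared functions are allowed to do.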
That risk aligns with common failure modes in the emerging LLM security canon, including prompt injection, where an attacker manipulates model instructions to override intended behavior, and sensitive information disclosure.
OWASP, the nonprofit Open Worldwide Application Security Project, ranks both issues among its Top 10 risks for LLM applications.
Independent research has pointed to similar exposure patterns. For instance, a 2025 Cisco study using Shodan described discovering more than 1,100 exposed Ollama servers and argued for baseline security controls around LLM deployments.
The scale differs, but the direction is consistent: local-first tooling is frequently being pushed onto networks without enterprise-grade hardening.
Guardrails can be removed at the prompt layer
Beyond infrastructure exposure, the researchers also analyzed system prompts that were visible through some API responses.
They reported at least 201 hosts running standardized “uncensored” prompt templates that explicitly remove or weaken built-in safety instructions, and noted the figure is a lower bound given the visibility limits of their method.
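The visibility itself comes from the API. A sketch of one retrieval path, assuming the `/api/show` endpoint and `system` response field described in Ollama's API documentation (field names are taken as an assumption here, and the host and model arguments are placeholders):

```python
import json
import urllib.request

def get_system_prompt(host: str, model: str, timeout: float = 3.0):
    """Fetch the system prompt an Ollama model advertises, if any.

    /api/show can return a model's configuration, including a baked-in
    SYSTEM instruction -- the kind of template the researchers inspected.
    Returns None if the host is unreachable or no system prompt is set.
    """
    req = urllib.request.Request(
        f"http://{host}:11434/api/show",
        data=json.dumps({"model": model}).encode(),
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return json.load(resp).get("system")
    except OSError:
        return None
```

Because such responses are readable without authentication on an exposed host, a removed or weakened safety instruction is not just a configuration choice; it is a publicly observable one.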
Governance gaps show up as attribution gaps
The researchers also highlighted response challenges, noting that while internet scanning can identify exposed endpoints, tying those endpoints to accountable owners is often difficult when hosting details are incomplete or unclear.
The geographic distribution of these exposures further complicates the governance landscape; researchers found that China accounts for the largest share of exposed hosts at approximately 30%, followed by the United States at just over 20%.
The researchers describe attribution friction as a practical governance constraint, especially when the ecosystem spans consumer networks, VPS providers and cloud environments.
That maps to a broader policy reality: open model capability can be produced by a small number of labs, but deployment decisions are pushed to countless downstream operators.
NIST’s AI Risk Management Framework describes risk management as a lifecycle discipline spanning development and deployment contexts, a discipline that becomes harder when the deployment layer is fragmented and unevenly governed.
Both that framework and MITRE’s SAFE-AI guidance recommend treating exposed LLM endpoints like any other internet-facing service: apply authentication, network segmentation, monitoring and least privilege.
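One way to apply that guidance without touching Ollama itself is to leave the server on its default loopback bind and put an authenticating gateway in front of it. A minimal sketch, assuming Python's standard library; the header scheme, port numbers and token are illustrative, and a production deployment would use TLS and a hardened reverse proxy rather than this skeleton:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
import urllib.request

UPSTREAM = "http://127.0.0.1:11434"  # Ollama's default loopback bind
TOKEN = "change-me"                  # placeholder shared secret

def authorized(header_value) -> bool:
    """True only for requests presenting the expected bearer token."""
    return header_value == f"Bearer {TOKEN}"

class AuthProxy(BaseHTTPRequestHandler):
    """Forwards POSTs to the local Ollama API only when authenticated."""

    def do_POST(self):
        if not authorized(self.headers.get("Authorization")):
            self.send_response(401)  # reject unauthenticated callers
            self.end_headers()
            return
        length = int(self.headers.get("Content-Length", 0))
        req = urllib.request.Request(
            UPSTREAM + self.path,
            data=self.rfile.read(length),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            self.send_response(resp.status)
            self.end_headers()
            self.wfile.write(resp.read())

# To run the gateway on the network-facing interface:
# HTTPServer(("0.0.0.0", 8080), AuthProxy).serve_forever()
```

The design choice is the point: the model server never becomes internet-facing, and the component that does is one whose job is access control.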