Brian

The patch installed cleanly. The vulnerable code may still be running.

Package managers show what changed on disk. They do not always show whether long-running services still have old libraries mapped in memory, or whether a host is still booted into the vulnerable kernel.

Vulnerability Management, Linux Hardening, Applied vs Live

It is late in the patch window. The update job finished. dnf update returned cleanly. The change ticket has a nice little green check next to it. The vulnerability scanner is already starting to look happier.

Somebody asks the question every on-call engineer has heard:

Are we good to close this?

And the tempting answer is yes. The fixed package is installed, so the host is fixed.

Sometimes that is true. Sometimes it is only half true.

On Linux, package state and runtime state are not the same thing. A package manager can tell you what is installed on disk. It cannot always tell you what code is still executing in memory. A service can keep an old shared library mapped after the RPM has been replaced. A kernel fix can be installed while the box is still booted into the vulnerable kernel. A patch job can succeed while the real remediation work is still waiting on a service restart, a drain, a reboot, or a second look from the person who owns the workload.

That is the gap between patched on disk and fixed in practice.

It is also the gap where a lot of sysadmin and DevOps pain lives.

The package database is only one view of the host

On RHEL-family systems, tools like rpm, dnf, vendor advisories, OVAL data, and scanner plugins usually start from installed package metadata. They answer useful questions:

  • Which package is installed?
  • Which version-release is installed?
  • Does the vendor say this build contains the security fix?
  • Is the installed package older than the fixed package?

That is good evidence. You want that evidence.

But it is not the whole host.

The question operators usually care about after patching is more direct:

Is the vulnerable code still running right now?

That answer lives in runtime state.

For kernels, the split is easy to see:

uname -r
rpm -q --last kernel | head

The newest installed kernel package is not necessarily the kernel the machine is running. It is only the kernel the machine can boot into next.
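That comparison can be scripted. The following is a minimal sketch, assuming a RHEL 8 or later host where kernels are packaged as `kernel-core` and where the `VERSION-RELEASE.ARCH` string matches what `uname -r` prints. The function names are illustrative and not part of any tool.

```shell
# Sketch only. Assumes kernels are packaged as "kernel-core" (RHEL 8 and later)
# and that VERSION-RELEASE.ARCH matches the `uname -r` string.

# Print the most recently installed kernel build, by RPM install time.
newest_installed_kernel() {
    rpm -q --qf '%{INSTALLTIME} %{VERSION}-%{RELEASE}.%{ARCH}\n' kernel-core |
        sort -n | tail -n 1 | cut -d ' ' -f 2
}

# Compare the running kernel with the newest installed one.
# Prints "live" when they match and "reboot-pending" when they do not.
kernel_state() {
    running=$1
    newest=$2
    if [ "$running" = "$newest" ]; then
        echo "live"
    else
        echo "reboot-pending"
    fi
}

# Example call on a real host:
#   kernel_state "$(uname -r)" "$(newest_installed_kernel)"
```

"Most recently installed" is a stand-in for "fixed" here. Whether that build actually contains the fix still has to come from the vendor advisory.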

Shared libraries have a similar problem. Updating an RPM replaces files on disk, but long-running processes do not automatically reload every object they already mapped. sshd, nginx, postgres, JVMs, sidecars, and internal daemons can keep using old code until the process restarts.

You can often see the rough shape of this with tools operators already reach for:

lsof +L1

or by looking at process maps:

grep '(deleted)' /proc/[0-9]*/maps 2>/dev/null | head

When an old library mapping shows up as deleted, the file on disk may already be gone, but a process can still have that object mapped. The package manager is not lying. The runtime is not lying either. They are answering different questions.
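The raw grep output gets noisy, because shared memory segments and memfd regions also show up as deleted. One rough way to narrow it down is to keep only deleted shared objects and print them next to the owning PID. This is a sketch that assumes library paths contain no spaces; the function names are made up for illustration.

```shell
# Sketch only. Assumes the /proc/PID/maps layout
# "address perms offset dev inode pathname" and paths without spaces.

# List shared objects that one maps file reports as deleted.
deleted_libs() {
    awk '$NF == "(deleted)" && $6 ~ /\.so/ { print $6 }' "$1" 2>/dev/null | sort -u
}

# Walk every numeric PID directory and print "pid command library".
# Run as root to see processes owned by other users.
scan_deleted_libs() {
    for maps in /proc/[0-9]*/maps; do
        pid=${maps#/proc/}
        pid=${pid%/maps}
        deleted_libs "$maps" | while read -r lib; do
            printf '%s %s %s\n' "$pid" "$(cat "/proc/$pid/comm" 2>/dev/null)" "$lib"
        done
    done
}

# Example call on a real host:
#   scan_deleted_libs | sort -k3
```

A hit means the process still maps an object that has been replaced or removed on disk. It does not say whether the old object was vulnerable, so the result still has to be matched against the package that was updated.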

That is what makes this class of issue so easy to miss.

This is why patch windows get weird

The clean version of vulnerability management is simple:

  1. Find vulnerable package.
  2. Install fixed package.
  3. Close finding.

The real version usually has more texture:

  • Some hosts update cleanly.
  • Some hosts miss the window.
  • Some boxes need reboots, but they are running stateful workloads.
  • Some services need restarts, but the owning team is not around.
  • Some scanners close findings because the package version changed.
  • Some scanners keep findings open because they misunderstand vendor backports.
  • Somebody exports a CSV.
  • Somebody else asks why the dashboard says green when the service never restarted.

That is not sloppy operations. That is normal operations.

Linux fleets have maintenance windows, uptime requirements, clusters, vendor backports, fragile services, old hosts, and just enough tribal knowledge to keep everyone humble. The tooling should account for that reality instead of pretending patching is one command and a victory lap.

If a finding only says “update OpenSSL,” it is not giving the operator enough information. After the package update, the next step might be:

  • restart nginx
  • restart sshd during an approved window
  • restart a custom service that loaded libssl
  • reboot the host into the fixed kernel
  • do nothing because the affected package is installed but not live in any relevant process
  • investigate because the host has not reported since the patch job

Those are different tasks. They have different owners, different blast radii, and different maintenance implications.
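Working out which task applies usually starts with mapping a PID back to the service that owns it. On a systemd host, one way to sketch that is to read the unit name from the end of the process's cgroup path. This assumes the path ends in a `.service` unit; `unit_from_cgroup` is a hypothetical helper name.

```shell
# Sketch only. Assumes a systemd host where /proc/PID/cgroup ends in the
# unit name, for example "0::/system.slice/nginx.service".

# Print the .service unit named at the end of a cgroup file, if any.
unit_from_cgroup() {
    sed -n 's#.*/\([^/]*\.service\)$#\1#p' "$1" | head -n 1
}

# Example call on a real host, using a PID taken from a stale-mapping check:
#   unit_from_cgroup /proc/1842/cgroup
```

A process running inside a login session scope or a container prints nothing here, which is itself useful: it marks work that needs a person to find the owner rather than a simple service restart.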

“Patch the package” is a starting instruction. It is not always a completion condition.

The common cases are not exotic

This problem can sound niche until you look for it across enough Linux systems. The usual cases are painfully familiar.

Kernel updates waiting on reboot

The fixed kernel package is installed, but the machine is still running the old kernel.

That might be fine for an hour while a cluster drains. It might be understandable until the next maintenance window. But it should not be counted as remediated just because the fixed package exists on disk.

For a single laptop, that is annoying. Across production hosts, it becomes a scheduling and risk problem:

  • Which machines are still exposed?
  • Which can reboot now?
  • Which need draining?
  • Which are part of a cluster where reboot order matters?
  • Which have been “pending reboot” for three patch cycles?

That last one is where everyone starts looking at the floor.

Shared libraries still mapped into long-running processes

OpenSSL, glibc, libxml2, krb5, NSS, curl, and similar libraries are used all over a Linux system. Updating the package does not restart every process that already loaded the old object.

The library is fixed on disk. The process may not be.

That distinction matters for services that stay up for weeks or months. It matters for anything customer-facing. It matters for old internal services that nobody wants to restart because the runbook is three sentences and one of them says “ask Mike.”

Deleted-but-still-open files

Linux lets a process keep using a file after it has been replaced or unlinked. That behavior is useful and normal. It is also exactly why a scanner that only checks installed package versions can miss the operational follow-through.

You patched the host. Good.

But a process may still have the old inode open.

Vendor backports and version confusion

Enterprise Linux distributions often backport security fixes. A RHEL package can look old compared with upstream while still containing the vendor’s fix.

Anyone who has argued with a scanner about Red Hat version numbers knows this problem.

Naive version comparison can be wrong in both directions:

  • It can mark a fixed vendor package as vulnerable because the upstream version number looks old.
  • It can mark a host as remediated because the package changed, while ignoring whether the fixed code is live.

Good vulnerability handling on enterprise Linux has to respect both vendor advisory data and runtime state. One without the other creates either noise or false confidence.
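For the backport side, a quick sanity check operators often use is to look for the CVE identifier in the package changelog. The sketch below wraps that check in a small filter. The helper name is hypothetical, and a changelog mention is supporting evidence only: vendor advisory data remains the authoritative source.

```shell
# Sketch only. Reads an RPM changelog on stdin and reports whether it
# mentions the given CVE identifier as a whole word.
changelog_mentions() {
    if grep -q -w -F -- "$1"; then
        echo "mentioned"
    else
        echo "not-mentioned"
    fi
}

# Example call on a real host (the CVE ID is the placeholder used in this post):
#   rpm -q --changelog openssl-libs | changelog_mentions CVE-20XX-1234
```

The `-w` flag keeps a search for one identifier from matching a longer one that merely starts with the same digits.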

What a useful finding should say

A finding that helps an operator should not stop at:

CVE present. Update package.

It should tell you where the work actually stands.

For a shared library issue, useful output looks more like this:

CVE-20XX-1234 affects openssl-libs.

Applied state:
  fixed package installed: openssl-libs-1:3.0.x-y.el9

Live state:
  nginx worker processes still map the old libssl object
  affected pids: 1842, 1843, 1844

Recommended action:
  restart nginx during the approved service window

Verification:
  re-check process maps after restart and confirm the old mapping is gone

For a kernel issue, it should say something different:

Applied state:
  fixed kernel package installed

Live state:
  running kernel is still 5.14.0-xxx.el9

Recommended action:
  reboot host into the fixed kernel

That is the difference between a CVE list and an operations queue.

It turns an abstract vulnerability into a concrete task. It gives the person on call a next action. It helps a security team understand whether a finding is still live risk or just leftover package metadata. It helps a manager understand why “we patched” and “we are done” are not always the same sentence.

It also changes prioritization. If you have 189 CVEs in a queue, not all of them deserve the same Saturday. A finding tied to CISA KEV, a high EPSS score, internet exposure, and live vulnerable code should come before a low-likelihood issue in a package that is installed but not actually running.
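As a toy illustration of that ordering, the sketch below sorts a tab-separated queue by KEV membership, then by whether the code is live, then by EPSS score. The column layout, the sort order, and the sample rows are all assumptions made for the example. Internet exposure could be added as another column in the same way.

```shell
# Sketch only. Input is tab-separated: CVE, KEV flag (1/0), EPSS score,
# live flag (1/0). The ordering (KEV, then live, then EPSS) is an assumption.
prioritize() {
    sort -t "$(printf '\t')" -k2,2nr -k4,4nr -k3,3nr
}

# Made-up sample rows with placeholder IDs and scores:
printf 'CVE-A\t0\t0.02\t0\nCVE-B\t1\t0.90\t1\nCVE-C\t0\t0.40\t1\n' | prioritize
# Prints the rows in the order CVE-B, CVE-C, CVE-A.
```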

The goal is not to make dashboards scarier.

The goal is to make them more honest.

“Clean” should mean verified, not assumed

Most teams already do some version of the right loop:

  1. Find vulnerable package.
  2. Apply update.
  3. Restart service or reboot host.
  4. Check that the issue is actually gone.
  5. Close the ticket.

The messy part is that steps 3 and 4 often live outside the scanner. They live in Slack threads, runbooks, maintenance notes, tribal knowledge, and “I think that host rebooted last night.”
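Step 4 can be turned into a check with an exit status, so that closing the ticket depends on evidence rather than memory. This is a minimal sketch: `no_stale_mapping` is a hypothetical helper, and it only inspects the maps files it is handed.

```shell
# Sketch only. Exit status 0 means none of the given maps files still map a
# deleted object whose path contains the pattern; 1 means at least one does.
# Unreadable or missing files are skipped, so run it as root on live PIDs.
no_stale_mapping() {
    pattern=$1
    shift
    for maps in "$@"; do
        if awk -v pat="$pattern" \
            '$NF == "(deleted)" && index($6, pat) { found = 1 } END { exit !found }' \
            "$maps" 2>/dev/null; then
            return 1
        fi
    done
    return 0
}

# Example call on a real host after restarting nginx:
#   pid=$(systemctl show -p MainPID --value nginx)
#   no_stale_mapping libssl "/proc/$pid/maps" && echo "verified"
```

The example only looks at the unit's main PID. A fuller check would cover every process in the unit, including workers.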

That fuzziness creates two bad outcomes.

The first is toil. Engineers spend time chasing findings that are already handled, or decoding scanner output that does not map cleanly to the work they need to do.

The second is false closure. A dashboard goes green while vulnerable code is still live, or a ticket closes because the package version changed even though the service never restarted.

Neither outcome helps the people carrying the pager.

What oxharden checks

oxharden is built around the applied-vs-live split.

The package database still matters. It is part of the evidence. But it is not treated as the whole answer.

For Linux hosts, oxharden looks at:

  • installed package state
  • vendor-fixed package and advisory context
  • running kernel state
  • running processes
  • mapped shared libraries
  • deleted or stale mappings where applicable
  • restart and reboot debt
  • exploit prioritization signals like KEV and EPSS
  • compliance and hardening evidence for teams that also need audit artifacts

The goal is to answer the question an operator actually has after patching:

What is still live, what needs to restart, and what can I prove is fixed?

That answer should work at the host level and across the fleet. A single box may need a service restart. A group of hosts may need the same remediation. A set of kernel findings may need a reboot plan. A compliance owner may need evidence that a fix was not just installed, but verified.

The dashboard should tell the truth

Security tooling should not make the fleet look cleaner than it is.

If a host is patched on disk but still running vulnerable code, say that.

If the fix requires a reboot, say that.

If a specific service restart should clear the finding, say which service.

If a host has not reported since the patch, call it a blind spot instead of quietly counting it as clean.

If a CVE is being exploited in the wild, let that change the order of work.

That is the practical difference between “we installed updates” and “we removed the exposure.”

Every Linux team already knows patching is not always neat. There are old hosts, odd services, maintenance windows, vendor backports, uptime requirements, and scanners that do not always agree with each other.

The tooling should meet operators in that reality.

Patched on disk is a good start. Fixed means the vulnerable code is no longer running.