Auditing GNU/Linux with OSQuery

September 15, 2022

Intro

Many incident responders struggle with Linux because their exposure to it is limited: corporate environments rarely have users working in a Linux desktop environment (outside of the occasional sysadmin). This prevents analysts from “knowing normal”, so the processes and activity on these hosts quickly become an enigma. Attacks in these environments rarely use a delivery mechanism tailored to users, such as maldocs attached to emails or staged on compromised or malicious sites. Most attacks instead stem from misconfigured or unpatched services exposed to the public web, commonly through vulnerabilities in web application components or their plugins. Post-exploitation activity tends to be low-level crypto-mining malware with surface-level persistence.

The circular irony of the situation is that anti-virus (AV) vendors and EDR/MDR services, in turn, pick up solely on basic string detections and signatures. Behavioral detections and other advancements on the AV front often fall short of what could be implemented. This is a cascade of consequences stemming from frequency, demand, and probability, yet most environments today still rely on Linux in their infrastructure.

To add insult to injury, logging is often poorly configured, and evidence of initial access may be buried in archived log files dating back several months.

It does not take much imagination to see what causes confusion for analysts. There are a few ways to unravel the activity: evaluating patch levels for services, persistence, the remote footprint, and web logs starts to shine a light on what has occurred.

Triage

There isn’t necessarily a Swiss Army knife for Linux investigations, but I’m going to paint a barebones scenario. A customer calls in requesting a threat hunt on a Linux server that has not been enrolled in X security service. The vendor and analysts have little to no visibility into the actions that have taken place, so any journals that could be used to correlate activity are out the window.

Credit for some of the established queries goes to Pherba, as does credit for the Linux persistence mapping diagram in the figure below.

[Figure: Linux persistence mapping diagram]

Pivot 1 - Persistence

The most obvious pivot for quick wins is to query persistence on the host. This can quickly identify references to scripts staged in different areas of the filesystem. Start with the easiest, most common persistence mechanisms and work toward the least common, following an order such as: crontabs -> set aliases -> modified / created service files -> boot persistence

select * from crontab
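
On a host with many cron entries, it can help to post-process the query output rather than read it by eye. The following is a minimal sketch, assuming `osqueryi` is on the PATH and is invoked with its `--json` output flag; the helper names `run_osquery` and `flag_cron_rows` and the pattern list are hypothetical and only illustrate the idea.

```python
import json
import re
import subprocess

# Illustrative patterns only (an assumption, not a vetted detection list).
SUSPICIOUS = re.compile(r"curl|wget|base64|/tmp/|/dev/shm/", re.IGNORECASE)

def run_osquery(sql, binary="osqueryi"):
    """Run one query through osqueryi and return the rows as a list of dicts."""
    result = subprocess.run(
        [binary, "--json", sql],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)

def flag_cron_rows(rows):
    """Return the crontab rows whose command matches a suspicious pattern."""
    return [row for row in rows if SUSPICIOUS.search(row.get("command", ""))]
```

A typical call would be `flag_cron_rows(run_osquery("SELECT * FROM crontab;"))`, after which the `path` and `command` fields of each flagged row point at the crontab file to inspect.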

Check last_modified / creation_time of boot, startup, and service files

SELECT
f.path,
u.username,
g.groupname,
datetime(f.atime, 'unixepoch') AS last_access,
datetime(f.mtime, 'unixepoch') AS last_modified,
datetime(f.ctime, 'unixepoch') AS last_status_change,
datetime(f.btime, 'unixepoch') AS creation_time,
ROUND((f.size * 1e-6), 4) AS size_mb
FROM file f
LEFT JOIN users u ON f.uid = u.uid
LEFT JOIN groups g ON f.gid = g.gid
-- Check for recent modifications (last_modified)
WHERE (path LIKE '/home/%/.%rc' OR path LIKE '/root/.%rc'
-- Check last_modified for on boot persistence
OR path LIKE '/etc/rc.local' OR path LIKE '/etc/profile' OR path LIKE '/etc/profile.d/%' OR path LIKE '/etc/bashrc' OR path LIKE '/etc/bash.bashrc'
-- Check systemd/run level scripts (uncomment to include)
-- OR path LIKE '/etc/rc%.d/%.sh' OR path LIKE '/etc/systemd/system/%'
)
AND f.mtime > strftime('%s', 'now', '-2 months');
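
The two-month cutoff compares an integer timestamp against the text that `strftime('%s', ...)` returns. osquery is built on SQLite, so the comparison can be explored with Python's `sqlite3` module. The sketch below uses a hypothetical stand-in `file` table with made-up rows; it suggests that the comparison relies on the column's integer affinity, and that an explicit `CAST` keeps it numeric even when no column is involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for osquery's file table, reduced to two columns.
conn.execute("CREATE TABLE file (path TEXT, mtime INTEGER)")
conn.execute(
    "INSERT INTO file VALUES "
    "('/etc/rc.local', CAST(strftime('%s', 'now', '-3 days') AS INTEGER))"
)
conn.execute(
    "INSERT INTO file VALUES "
    "('/etc/profile', CAST(strftime('%s', 'now', '-2 years') AS INTEGER))"
)

def recent_paths(connection):
    """Return paths modified within the last two months."""
    sql = """
        SELECT path FROM file
        WHERE mtime > CAST(strftime('%s', 'now', '-2 months') AS INTEGER)
    """
    return [row[0] for row in connection.execute(sql)]

# Two bare values carry no affinity, so the integer sorts below the text.
bare = conn.execute(
    "SELECT 1700000000 > strftime('%s', '2000-01-01')"
).fetchone()[0]
# Casting the cutoff to INTEGER makes the comparison numeric.
cast = conn.execute(
    "SELECT 1700000000 > CAST(strftime('%s', '2000-01-01') AS INTEGER)"
).fetchone()[0]

print(recent_paths(conn), bare, cast)
```

Only the file touched three days ago is returned, and the bare comparison evaluates to 0 while the cast comparison evaluates to 1.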

Shared libraries injected via LD_PRELOAD

SELECT process_envs.pid AS source_process_id, process_envs.key AS environment_variable_key, process_envs.value AS environment_variable_value,
processes.name AS source_process, processes.path AS file_path, processes.cmdline AS source_process_commandline, processes.cwd AS current_working_directory,
'T1055' AS event_attack_id, 'Process Injection' AS event_attack_technique, 'Defense Evasion, Privilege Escalation' AS event_attack_tactic
FROM process_envs
JOIN processes USING (pid)
WHERE key = 'LD_PRELOAD';
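
When a live terminal session is available, the same check can be cross-referenced directly against procfs. The following is a rough sketch, assuming a standard Linux `/proc` layout where `/proc/<pid>/environ` holds NUL-separated `KEY=value` entries; the function names are hypothetical, and reading other users' environments generally requires root.

```python
import os

def parse_environ(raw):
    """Split a NUL-separated /proc/<pid>/environ blob into a dict."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            key, _, value = entry.partition(b"=")
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env

def preloaded_processes(proc_root="/proc"):
    """Yield (pid, LD_PRELOAD value) for each readable process that sets it."""
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc_root, name, "environ"), "rb") as fh:
                env = parse_environ(fh.read())
        except OSError:
            continue  # process exited or permission was denied
        if "LD_PRELOAD" in env:
            yield int(name), env["LD_PRELOAD"]
```

Iterating over `preloaded_processes()` gives the PIDs worth comparing against the query results; a library path under a world-writable directory is usually the first thing to look at.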

Pivot 2 - Shell History w/ optional filters provided for elevated privileges and specific UID

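-- Join on users so history is returned for every account, not only the current user
-- datetime(..., 'unixepoch') already returns UTC
SELECT DISTINCT sh.command AS cmd, datetime(sh.time, 'unixepoch') AS run_time, uid
FROM users JOIN shell_history sh USING (uid)
-- WHERE
-- (sh.command LIKE 'sudo %' OR sh.command LIKE 'doas %')
-- AND uid = 1000 -- example value, substitute the UID of interest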

Pivot 3 - Running Processes w/ optional string filters

SELECT name, pid, datetime(start_time, 'unixepoch') AS start_time, cmdline, state
FROM processes
-- WHERE
-- uid = 0 AND
-- (cmdline LIKE '%keygen%' OR cmdline LIKE '%authorized_key%' OR cmdline LIKE '%cron%' OR cmdline LIKE '%useradd%' OR cmdline LIKE '%nc -l%' OR cmdline LIKE '%ncat -%' OR cmdline LIKE '%.php%' OR cmdline LIKE '%ufw disable%' OR cmdline LIKE '%iptables -F%' OR cmdline LIKE '%sysctl%' OR cmdline LIKE '%apparmor%' OR cmdline LIKE '%SELINUX%' OR cmdline LIKE '%chattr%' OR cmdline LIKE '%setfacl%' OR cmdline LIKE '%hist%' OR cmdline LIKE '%wget%' OR cmdline LIKE '%curl%' OR cmdline LIKE '%xmrig%' OR cmdline LIKE '%krebs%' OR cmdline LIKE '%monero%' OR cmdline LIKE '%miner%')

Processes with linked parent execution:

WITH pstree AS (
  SELECT 0 AS level, pid, name, parent, name AS pparent, uid, CAST(uid AS varchar(10)) AS puid
  FROM processes WHERE parent = 0
  UNION ALL
  SELECT level + 1 AS level, t.pid, t.name, t.parent, pstree.pparent || '->' || t.name AS pparent, t.uid, pstree.puid || '->' || t.uid AS puid
  FROM processes t INNER JOIN pstree ON t.parent = pstree.pid )
SELECT level, pid, name, pparent AS process_chain, puid AS user_chain FROM pstree;
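
The recursive CTE is easier to follow on a small data set. Since osquery is built on SQLite, the sketch below replays a simplified version of the query with Python's `sqlite3` module against a hypothetical `processes` table; the PIDs and process names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for osquery's processes table.
conn.execute(
    "CREATE TABLE processes (pid INTEGER, name TEXT, parent INTEGER, uid INTEGER)"
)
conn.executemany(
    "INSERT INTO processes VALUES (?, ?, ?, ?)",
    [
        (1, "systemd", 0, 0),
        (700, "apache2", 1, 0),
        (812, "sh", 700, 33),
        (815, "curl", 812, 33),
    ],
)

PSTREE = """
WITH RECURSIVE pstree AS (
  -- Anchor: processes with no parent
  SELECT 0 AS level, pid, name, name AS chain
  FROM processes WHERE parent = 0
  UNION ALL
  -- Recursive step: append each child to its parent's chain
  SELECT pstree.level + 1, t.pid, t.name, pstree.chain || '->' || t.name
  FROM processes t INNER JOIN pstree ON t.parent = pstree.pid
)
SELECT level, pid, chain FROM pstree ORDER BY level;
"""

def process_chains(connection):
    """Return (level, pid, chain) tuples ordered from the root outward."""
    return connection.execute(PSTREE).fetchall()

for level, pid, chain in process_chains(conn):
    print(level, pid, chain)
```

The deepest row reads `systemd->apache2->sh->curl`: a web server spawning a shell that fetches something, which is the kind of chain this pivot is meant to surface.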

Pivot 4 - Open Sockets w/ optional filter

SELECT p.pid, p.name, pos.remote_address
FROM processes p
JOIN process_open_sockets pos ON p.pid = pos.pid
-- WHERE pos.remote_address NOT IN ('', '0.0.0.0', '::');

SELECT pid, remote_address, local_port, remote_port, s.state, p.name, p.cmdline, p.uid, username
FROM process_open_sockets AS s
JOIN processes AS p USING (pid)
JOIN users USING (uid)
WHERE s.state = 'ESTABLISHED'
OR s.state = 'LISTEN';
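
Socket output tends to be dominated by loopback and internal addresses. One option is to filter the rows after export; the sketch below assumes the rows were exported as dictionaries (for example from JSON output) with a `remote_address` field, and uses Python's standard `ipaddress` module. The function name `external_connections` is hypothetical.

```python
import ipaddress

def external_connections(rows):
    """Keep socket rows whose remote_address is a globally routable IP."""
    kept = []
    for row in rows:
        try:
            addr = ipaddress.ip_address(row.get("remote_address", ""))
        except ValueError:
            continue  # empty or non-IP value
        if addr.is_global:
            kept.append(row)
    return kept
```

Private, loopback, and unspecified addresses (`0.0.0.0`, `::`) are dropped, leaving only connections that leave the network. Internal lateral movement would be missed by this filter, so it is a triage aid rather than a replacement for reviewing the full output.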

Pivot 5 - Suspicious Binaries

Check for suspicious binaries (only effective if a baseline of normal has been established). For inexperienced analysts, this will likely cause more confusion than clarity unless a malicious binary has a blatantly suspicious name, such as a randomized string or a cryptomining reference.

SELECT * from suid_bin

Pivot 6 - User information

SELECT
username, gid, directory, shell, description, type
FROM users
WHERE shell LIKE '%sh'
-- AND uid=0

Pivot 7 - Staged SSH keys

Often attackers will implant SSH keys to establish a foothold on a host.

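-- Join on users so keys are returned for every account, not only the current user
SELECT user_ssh_keys.* FROM users JOIN user_ssh_keys USING (uid)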

SELECT authorized_keys.*
FROM users
JOIN authorized_keys
USING(UID)

Pivot 8 - Suspicious Processes

SELECT name, cmdline, pid, datetime(start_time, 'unixepoch') AS start_time, parent, state
FROM processes
WHERE
    parent = 1
    AND regex_match(cmdline, 'python|bash|\bsh\b', 0) IS NOT NULL;
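
Before running the pattern across a fleet, it is worth checking what it does and does not match. The sketch below uses Python's `re` module as a stand-in, assuming the word-boundary semantics of `\b` are comparable to those of the regex engine osquery uses; the function name and sample command lines are hypothetical.

```python
import re

# Same pattern as the query above.
PATTERN = re.compile(r"python|bash|\bsh\b")

def matches_interpreter(cmdline):
    """Return True if the command line references python, bash, or a bare sh."""
    return PATTERN.search(cmdline) is not None

samples = [
    "/bin/sh -c id",
    "/usr/bin/python3 /tmp/a.py",
    "/usr/sbin/sshd -D",
    "./run.sh",
]
for cmdline in samples:
    print(cmdline, matches_interpreter(cmdline))
```

The word boundaries keep `sshd` from matching, but a script named `run.sh` still matches because `.` counts as a boundary, so expect some benign hits from init-parented shell scripts.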

Conclusion

Inevitably, some pivots are going to be more effective from a live terminal session. OSQuery is a powerful tool, but it has limitations, such as difficulty displaying hidden files. Outside of currently running processes that are listening or have open sockets, network traffic analysis is non-existent, especially when the investigation is purely retroactive. There are proprietary tables that would give OSQuery more utility, such as network journals, process journals, and grep tables (which allow enumeration of file contents), and they would aid investigations had they been open-source.

Vendor/Community provided queries:

Additional OSQuery extensions:

https://github.com/trailofbits/osquery-extensions
