May Highlights

BreachForce Meetup May - Security Automation and Malware Research


Talk 1 - Security Automation with AI & Telegram Bots - Dhiraj Ambigapathi

Mindset & Philosophy

  • Go to questionable forums, research communities, GitHub projects, and niche corners of the internet.

  • Figure things out as you go; there is no fixed roadmap in cybersecurity.

  • Automation already exists in cybersecurity:

    • SIEM correlation

    • Log analysis

    • Bug bounty scripting

    • Source code review tools

    • Vulnerability scanners

  • The next logical step is using AI to orchestrate and automate those existing workflows.

  • AI should automate repetitive tasks, not replace human analysts.

  • Human-in-the-loop (HITL) should be mandatory for important decisions.

  • Never blindly trust AI outputs; validation is required.


Questions That Drove the Project

  • How do I find subdomains while drinking coffee?

  • How do I analyze PCAPs while sleeping?

  • How do I continuously track new CVEs?

  • How do I identify internet-facing systems affected by those CVEs?

  • How do I reduce time spent on repetitive reconnaissance?


AI Security Automation Goals

  • Automate reconnaissance.

  • Automate vulnerability intelligence gathering.

  • Automate data collection and summarization.

  • Allow security professionals to focus on:

    • Exploitation

    • Validation

    • Investigation

    • Decision making


  • XBOW AI reached the #1 position on the HackerOne (H1) leaderboard among security agents.

  • Mythos and similar platforms demonstrate AI-assisted bug hunting.

  • AI-assisted security research is becoming practical.


Claude Code & Skills

  • Claude Code was selected because:

    • Supports MCP (Model Context Protocol).

    • Supports Skills.

What are Skills?

Skills are essentially instruction files that teach Claude:

  • How to behave.

  • How to perform specific tasks.

  • When to use tools.

  • How to follow workflows.

Examples:

  • SSRF testing

  • SQL Injection testing

  • XSS testing

  • Tool selection logic

  • Workflow execution rules

Think of Skills as operational playbooks for the LLM.


Infrastructure Architecture

Master Node

Runs:

  • N8N

  • Workflow orchestration

  • Telegram bot integration

  • AI communication

Slave Node

Runs:

Design Philosophy

  • Separate brains from muscle.

  • Internet exposure should be minimal.

  • Defense in depth.

  • Master communicates with worker over SSH.

  • Workers remain isolated.


Why Telegram?

Telegram was chosen because:

  • Bot APIs are mature.

  • Easier automation.

  • Public static IP ranges are available.

  • Simpler network whitelisting.

Observation:

  • Telegram IPs appear to be static and easier to whitelist.

  • WhatsApp and Discord don't provide the same level of predictable static IP visibility for this architecture.
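The allowlisting idea above can be sketched in a few lines. The two CIDR ranges below are the ones Telegram's webhook guide has listed for incoming webhook traffic; treat them as an assumption and re-check the current documentation before relying on them. The function name `is_telegram_source` is hypothetical and not part of the talk.

```python
import ipaddress

# Assumption: ranges copied from Telegram's webhook guide; verify before use.
TELEGRAM_WEBHOOK_RANGES = [
    ipaddress.ip_network("149.154.160.0/20"),
    ipaddress.ip_network("91.108.4.0/22"),
]


def is_telegram_source(ip: str) -> bool:
    """Return True if the address falls inside one of the allowlisted ranges."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in TELEGRAM_WEBHOOK_RANGES)


inside = is_telegram_source("149.154.167.220")
outside = is_telegram_source("8.8.8.8")
print(inside, outside)
```

In a real deployment the same check would more likely live in a firewall or reverse-proxy rule than in application code; the sketch only shows the membership test.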


N8N as the Gatekeeper

N8N acts as:

  • Input validator

  • Access controller

  • Workflow orchestrator

  • Human approval checkpoint

Before any scan:

  • Validate input.

  • Validate domain format.

  • Validate permissions.

Examples:

  • Regex-based domain validation.

  • User authorization checks.
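A minimal sketch of the two gatekeeper checks listed above, written in Python for readability (inside N8N the same logic would sit in a workflow node). The regex, the user ID, and the names `is_valid_domain` and `gate_request` are illustrative assumptions, not the speaker's actual configuration.

```python
import re

# Hostname labels of letters, digits and hyphens, ending in an alphabetic TLD.
# Note: this deliberately rejects punycode TLDs and anything with spaces,
# slashes or shell metacharacters.
DOMAIN_RE = re.compile(
    r"^(?=.{1,253}$)(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}$"
)

# Hypothetical Telegram user IDs that are allowed to start scans.
AUTHORIZED_USER_IDS = {111111111}


def is_valid_domain(value: str) -> bool:
    """Return True only for strings that look like a plain domain name."""
    return bool(DOMAIN_RE.fullmatch(value.strip().lower()))


def gate_request(user_id: int, target: str) -> str:
    """Decide whether a scan request may proceed to the next workflow step."""
    if user_id not in AUTHORIZED_USER_IDS:
        return "rejected: user not authorized"
    if not is_valid_domain(target):
        return "rejected: invalid domain"
    return "accepted"


print(gate_request(111111111, "example.com"))
print(gate_request(111111111, "example.com; rm -rf /"))
print(gate_request(222222222, "example.com"))
```

Validating before anything reaches a shell matters here because the target string eventually becomes an argument to scanning tools on the worker.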


Human-In-The-Loop Workflow

  1. User submits target.

  2. Recon runs.

  3. Results sent to user.

  4. User approves next phase.

  5. Vulnerability scans run.

  6. Results summarized.

  7. User approves AI validation.

  8. Final report generated.

No fully autonomous offensive actions.
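The eight steps above boil down to a pipeline with approval gates in front of the riskier phases. The sketch below is one possible way to model that; the phase names and the `run_phase` and `ask_approval` callbacks are hypothetical placeholders for the real workflow nodes and Telegram prompts.

```python
PHASES = ["recon", "vulnerability_scan", "ai_validation", "final_report"]

# Phases that must not start until a human has said yes.
NEEDS_APPROVAL = {"vulnerability_scan", "ai_validation"}


def run_pipeline(target, run_phase, ask_approval):
    """Run phases in order and stop at the first approval that is declined."""
    completed = []
    for phase in PHASES:
        if phase in NEEDS_APPROVAL and not ask_approval(phase, completed):
            break
        run_phase(phase, target)
        completed.append(phase)
    return completed


def fake_phase(phase, target):
    # Stand-in for the real work; only prints what would run.
    print(f"[{phase}] running against {target}")


approved_all = run_pipeline("example.com", fake_phase, lambda phase, done: True)
stopped_early = run_pipeline(
    "example.com", fake_phase, lambda phase, done: phase != "ai_validation"
)
print(approved_all)
print(stopped_early)
```

In this sketch a declined approval ends the run; a real workflow might still produce a partial report at that point.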


External Attack Surface Workflow

Enumeration

Historical Domains

SecurityTrails is used for:

  • Historical DNS records

  • Historical subdomains

  • Enumeration enrichment

Historical domains often reveal:

  • Forgotten assets

  • Legacy infrastructure

  • Shadow IT


Vulnerability Assessment Workflow

Discovery

  • Nmap

  • Masscan

  • Shodan

Validation

  • Nuclei

  • OpenVAS

  • Nmap scripts

Reporting

  • AI summarizes findings.

  • Human validates conclusions.


Additional Capabilities

OSINT

  • Spiderfoot

GitHub Exposure Hunting

  • Search GitHub for:

    • API keys

    • Secrets

    • Credentials
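As an illustration of what hunting for the items above can look like once file contents have been fetched, here is a small pattern-matching sketch. The patterns are common public examples (the AWS access key ID prefix, a PEM private key header, a generic quoted assignment) and are assumptions, not the rule set used in the talk; real scanners use far larger rule sets and entropy checks.

```python
import re

PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    "generic_assignment": re.compile(
        r"(?:api[_-]?key|secret|token)\s*[:=]\s*['\"][^'\"]{16,}['\"]",
        re.IGNORECASE,
    ),
}


def find_secrets(text: str):
    """Return (line_number, pattern_name) pairs for every suspicious line."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, name))
    return findings


# AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
sample = (
    'aws_key = "AKIAIOSFODNN7EXAMPLE"\n'
    'api_key = "0123456789abcdef0123"\n'
    'print("hello")\n'
)
findings = find_secrets(sample)
print(findings)
```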

External APIs

  • Chaos API (ProjectDiscovery)

Frameworks

  • Frogy 2.0

  • reNgine

Repository used:


CVE Intelligence Automation

Current Process

  • Pull latest CVEs from RSS feeds.

  • Focus on recent vulnerabilities (for example last hour).

  • Extract:

    • CVE ID

    • Product

    • Vendor

    • Severity
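A rough sketch of the "pull recent CVEs from RSS" step, assuming a plain RSS 2.0 feed whose item titles contain the CVE ID. The feed content, the CVE numbers, and the function name `recent_cves` are made up for illustration; product, vendor, and severity usually need a second lookup and are left out here.

```python
import re
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

CVE_ID_RE = re.compile(r"CVE-\d{4}-\d{4,}")

# Hypothetical feed snippet; a real workflow would download this over HTTPS.
SAMPLE_FEED = """<rss version="2.0"><channel>
<item><title>CVE-2099-0001: ExampleVendor ExampleProduct remote code execution</title>
<pubDate>05 Jan 2099 10:30:00 +0000</pubDate></item>
<item><title>CVE-2099-0002: ExampleVendor OtherProduct information leak</title>
<pubDate>05 Jan 2099 07:00:00 +0000</pubDate></item>
</channel></rss>"""


def recent_cves(feed_xml, now, window=timedelta(hours=1)):
    """Return CVE entries published within `window` before `now`."""
    results = []
    for item in ET.fromstring(feed_xml).iter("item"):
        title = item.findtext("title", default="")
        published = parsedate_to_datetime(item.findtext("pubDate"))
        match = CVE_ID_RE.search(title)
        if match and timedelta(0) <= now - published <= window:
            results.append({"cve_id": match.group(0), "title": title})
    return results


now = datetime(2099, 1, 5, 11, 0, tzinfo=timezone.utc)
latest = recent_cves(SAMPLE_FEED, now)
print([entry["cve_id"] for entry in latest])
```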

Exposure Validation

After collecting CVEs:

  • Query Shodan.

  • Determine:

    • How many systems are exposed.

    • Which services are vulnerable.

    • Internet-facing exposure.

Goal:

"Which newly released vulnerabilities are currently exploitable on internet-facing systems?"

Important Note

The $5 Shodan plan does not provide all advanced vulnerability filters.


PCAP Analysis Workflow

Separate workflow from web reconnaissance.

Typical flow:

  1. PCAP ingestion.

  2. Zeek processing.

  3. Suricata analysis.

  4. Artifact extraction.

  5. AI summarization.

  6. Human review.

Tools:

  • Zeek

  • Suricata

  • Binwalk
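The first processing steps of the flow above can be sketched as a small job builder. It only assembles command lines and does not execute anything. `zeek -r` and `suricata -r ... -l ...` are the standard offline-PCAP invocations; the directory layout and the function name `build_pcap_jobs` are assumptions, and the artifact-extraction step is left out.

```python
from pathlib import Path


def build_pcap_jobs(pcap_path, work_dir):
    """Describe the offline Zeek and Suricata runs for one capture file."""
    pcap = Path(pcap_path)
    work = Path(work_dir)
    return [
        {
            # Zeek writes its logs into the current working directory.
            "name": "zeek",
            "argv": ["zeek", "-r", str(pcap)],
            "cwd": work / "zeek",
        },
        {
            # -l selects the directory for Suricata's log output.
            "name": "suricata",
            "argv": ["suricata", "-r", str(pcap), "-l", str(work / "suricata")],
            "cwd": work,
        },
    ]


jobs = build_pcap_jobs("capture.pcap", "runs/example")
for job in jobs:
    print(job["name"], "->", " ".join(job["argv"]))
```

Each job could then be handed to `subprocess.run(job["argv"], cwd=job["cwd"])` on the worker, with the resulting logs passed on for summarization and human review.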


Data Processing & Storage

  • Raw scan data stored on filesystem.

  • Structured outputs are critical.

  • Use unique directories:

    • Timestamp based

    • Hash based

Avoid:

  • Shared output.txt files

  • Race conditions

  • Data collisions
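One way to get collision-free output locations is to combine a timestamp with a short hash, as sketched below. The naming scheme and the helper `make_scan_dir` are assumptions for illustration, not the layout used in the talk.

```python
import hashlib
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def make_scan_dir(base, target, tool):
    """Create and return a fresh directory for one tool run against one target."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    # Random bytes keep two runs started in the same microsecond apart.
    seed = f"{target}|{tool}|{stamp}|{os.urandom(8).hex()}"
    digest = hashlib.sha256(seed.encode()).hexdigest()[:12]
    path = Path(base) / f"{stamp}_{digest}"
    # exist_ok=False turns any accidental collision into a loud error.
    path.mkdir(parents=True, exist_ok=False)
    return path


with tempfile.TemporaryDirectory() as base:
    first = make_scan_dir(base, "example.com", "nmap")
    second = make_scan_dir(base, "example.com", "nmap")
    both_created = first.is_dir() and second.is_dir()
    distinct = first != second

print(both_created, distinct)
```

Because every run writes into its own directory, parallel scans never append to the same file, which removes the race conditions listed above.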


ARM Challenges

Deployment was done on Raspberry Pi.

Challenges:

  • ARM architecture compatibility.

  • Cross-compilation required.

  • Some security tools required custom builds.

  • MCP servers and dependencies needed ARM support.


Claude Code Economics

  • Claude Code usage can be relatively inexpensive.

  • Workflows can run unattended for hours.

  • Suitable for long-running automation tasks.


Telegram Operational Lessons

Telegram has a hard limit:

  • 4096 characters per message.

Solution:

  • Chunk long outputs.

  • Split reports automatically.

  • Send multi-part messages.
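A minimal chunking sketch for the limit above. It prefers to split at line breaks so findings are not cut in half. The function name `chunk_message` is hypothetical, and the sketch counts Python characters, which is an approximation of how Telegram measures message length.

```python
TELEGRAM_MAX_CHARS = 4096


def chunk_message(text, limit=TELEGRAM_MAX_CHARS):
    """Split text into pieces of at most `limit` characters."""
    chunks = []
    while len(text) > limit:
        # Look for the last newline that still fits inside the limit.
        cut = text.rfind("\n", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no usable newline: hard split
        chunks.append(text[:cut])
        text = text[cut:].lstrip("\n")
    if text:
        chunks.append(text)
    return chunks


report = "\n".join(f"finding {i}: " + "x" * 50 for i in range(300))
chunks = chunk_message(report)
print(len(chunks), max(len(chunk) for chunk in chunks))
```

Each chunk would then be sent as its own message, in order.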


AI Safety Lessons

Things that can go wrong:

  • Prompt injection.

  • Rogue agents.

  • Production database deletion.

  • Sensitive data leakage.

  • API key exposure.

Examples cited:

  • Claude/Cursor incidents.

  • ChatGPT API keys exposed on GitHub.

  • Agent manipulation attacks.


Malware Analysis & AI

Thomas Roccia's observation:

Malware analysis is no longer purely a human problem.

AI can assist with:

  • Triage

  • Classification

  • IOC extraction

  • Pattern recognition

  • Report generation

But final analyst validation remains important.

Reference: https://blog.securitybreak.io/malware-reverse-engineering-is-no-longer-a-human-problem-5441e4a0564fa


Talk 2 - The Malware Researcher's Roadmap (Open Talk) - Adhokshaj Mishra

  • Why did we enroll in engineering? Was it for money? Was it because our parents told us to? Or was the motivation something else?

  • After completing engineering, why don't we get a happy ending? We were told that once we earn the degree, a job will eventually follow, and with it the happy ending.

  • "Life set kyu nahi hai fir?" ("Then why isn't life sorted?") The problem: in college, we learnt what to learn. It's called RATTI-FICATION (rote memorization).

  • But why did we rattify things? Why didn't we ask any questions?

  • We didn't ask any questions because we are the good citizens of our country.

  • "Good citizens don't ask questions" - Mishra Ji

  • We spent the whole college life dealing with mid sem, end sem, terms, minor projects, major projects, assignments, etc.

  • OH SHIT! We never used anything we learnt while doing all of the above!

  • "Jo engineering mai padha uska use hi nahi" ("What we studied in engineering is never even used") - Mishra Ji

  • From early on we were told to excel in Excel, which we did, yet we are still not excelling in life. Why is this happening?

  • Why does this Excel-in-Excel tragedy not happen to folks in the US/UK? Where is the problem?

  • The subjects are the same. The syllabus is the same. The degree is the same. Then where is the damn problem?

  • Why are there variations in the outcomes of the degree for both of us?

  • The problem is that we have been taught what to learn but not how to learn!

  • "Seekhna kaise hai woh koii nahi seekhata" ("Nobody teaches us how to learn")

  • In school, how many of us asked questions while new topics in Physics were being explained?

  • We have been brainwashed not to ask our teachers any questions; otherwise we would be in trouble.

  • Throughout school science education, we are taught facts and theories, but we are rarely taught to actively challenge our own knowledge and arguments.

  • "Baba Vakyam Pramanam" ("The elder's word is proof enough") - Mishra Ji

    • "Sawaal nahi puchhna hai" ("Do not ask questions")
  • From school through college, we are trained to optimize for exams rather than understanding.

  • We memorize conclusions, formulas, and statements, but rarely investigate the reasoning, proof, evidence, or assumptions behind them. As a result, we know what is true, but not how we know it is true.

  • Maths has Proofs and Derivations

  • We often memorize statements in Physics, so why don't we do the same in Maths? Why do we need to prove everything in Maths?

  • Society often pressures us to accept claims without questioning them. Mathematics teaches the exact opposite: do not trust a statement merely because an authority made it - understand the proof that makes it true.

  • Human knowledge is a collective effort built over centuries.

  • Teachers are (ideally) filtered/vetted transmitters of that knowledge.

  • But you should not believe a claim merely because a teacher, book, or authority said it. You should understand the proof, reasoning, or evidence behind it.

  • Let's look at some proofs of truth here.

Geometric Construction

  • In school, we had a chapter called Construction in Geometry. In it, we had to construct triangles, squares, bisectors, circles, etc. Why did we do that?

  • Why do we need to study construction in geometry even though we have proven everything through algebra?

  • We are not going into architecture. We won't be learning CAD in the future. Then why do we construct?

  • None of us have asked this question.

  • Because it is part of the PROOF!

  • Construction is the visual proof that shapes like triangles, circles, etc. can exist if we follow a particular set of steps to construct them.

  • If the proof exists, it means the shape exists in the real world.

  • But after some time we stop doing construction in maths. The Construction chapter comes and goes, and it does not return in later years. Why does this happen? Why can we not prove everything through construction?

  • Because Construction has its limits!

  • If we cannot construct the shape, there are two possibilities:

    • A: The shape does not exist; hence, the construction failed!

    • B: There might be some errors in the steps we followed while trying to construct the shape.

  • As we go further, the boundary between these two cases starts blurring. Hence, Construction cannot be a reliable proof of whether a shape exists or not.

  • Therefore, we switched to algebra, resulting in Algebraic Geometry. We started proving things using algebra.

Physics

  • Now let's come to Physics.

  • In school Physics, we learn:

    • Law of Reflection

    • Angle of Incidence (i) = Angle of Reflection (r)

  • Most students:

    • Memorize i = r

    • Solve numerical problems

    • Write it in exams

    • Forget it later

  • The important question is:

    • What does this law explain in the real world?
  • Understanding i = r explains:

    • How mirrors work.

    • How reflective surfaces work.

    • Why road signs are visible at night.

    • Why road reflectors appear bright.

    • Why bicycle reflectors work.

    • Why safety jackets have reflective strips.

  • Road reflectors are not generating light.

    • They reflect light from vehicle headlights.

    • The reflected light travels back towards the driver.

    • This makes roads visible at night.

  • Retroreflectors are specially designed reflectors.

    • They use multiple reflections.

    • Each reflection follows i = r.

    • The final reflected ray travels back toward the source.

  • The formula i = r is not just an exam fact.

    • It explains actual engineering systems used every day.
  • Students usually learn:

    • i = r
  • Researchers ask:

    • Why are road signs visible at night?

    • Why do reflectors shine?

    • Why does retroreflection work?

    • What principle is responsible?

  • A single Physics statement can explain many real-world systems.

  • Don't stop at:

    • "What is the formula?"
  • Ask:

    • "What does the formula explain?"

    • "How is it used in the real world?"

    • "Why is it true?"
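The retroreflector claim above can be checked with a few lines of arithmetic. In vector form, i = r says that a ray with direction d hitting a mirror with unit normal n leaves with direction d - 2(d·n)n. The sketch below applies that rule to three mutually perpendicular mirrors, as in a corner-cube reflector; the example ray is arbitrary.

```python
def reflect(direction, normal):
    """Reflect a direction vector off a mirror with the given unit normal."""
    dot = sum(d * n for d, n in zip(direction, normal))
    return tuple(d - 2 * dot * n for d, n in zip(direction, normal))


incoming = (3, -4, 12)

# Three mutually perpendicular mirrors, like the inside corner of a box.
mirror_normals = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

ray = incoming
for normal in mirror_normals:
    ray = reflect(ray, normal)

print(incoming, "->", ray)
```

Each bounce flips one component of the direction, so after three bounces the ray points exactly back along the way it came, whatever the incoming angle was. That is why the reflected headlight beam returns toward the driver.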

Random Number Generators

  • In Semester 1: Maths

    • We learn:

      • Probability

      • Statistics

      • Combinatorics

      • Logic

      • Proofs

    • Most students ask:

      • "Why are we studying this?"

      • "Where will this be used?"

  • In Semester 2: Programming

  • Question:

    Where did this "randomness" come from?

  • Now the maths from Semester 1 suddenly becomes relevant.

  • Some algorithms intentionally use randomness.

  • Two famous categories:

    • Monte Carlo Algorithms

    • Las Vegas Algorithms
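To make the two categories concrete, here is a small sketch of each. A Monte Carlo algorithm runs for a fixed amount of work and returns an answer that is only probably close; a Las Vegas algorithm always returns a correct answer, but its running time is random. Both examples (estimating pi, random probing for an element known to be present) are standard textbook illustrations chosen for this sketch, not examples from the talk.

```python
import random


def monte_carlo_pi(samples, rng):
    """Fixed work, approximate answer: estimate pi from random points."""
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / samples


def las_vegas_find(items, target, rng):
    """Exact answer, random work: probe random slots until target is hit.

    Assumes target is present in items, otherwise this never terminates.
    """
    probes = 0
    while True:
        probes += 1
        index = rng.randrange(len(items))
        if items[index] == target:
            return index, probes


rng = random.Random(7)
pi_estimate = monte_carlo_pi(100_000, rng)
items = ["hay"] * 9 + ["needle"]
index, probes = las_vegas_find(items, "needle", rng)
print(f"pi is roughly {pi_estimate:.3f}")
print(f"needle found at index {index} after {probes} probes")
```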

Turing Machines

  • A computer can be viewed as a physical implementation of a Turing Machine.

  • Turing Machines are used to model computation.

  • Classical computers are deterministic systems.

  • Same input + same initial state ⇒ same execution path ⇒ same output.

  • Computers execute deterministic instructions.

  • They do not magically create randomness.

  • Yet programming languages provide functions that appear to generate random values.

  • This raises an important question:

    • Where does randomness come from?
  • Most software uses PRNGs.

  • PRNGs are deterministic algorithms.

  • They generate numbers that look random.

  • Given the same seed:

    • Same sequence of numbers is generated.
  • Randomness is simulated, not truly created.

  • TRNGs obtain randomness from physical phenomena.

  • Examples:

    • Thermal noise

    • Electrical noise

    • Radioactive decay

    • Quantum effects

  • Output cannot be reproduced by simply reusing a seed.

  • Provides real entropy.

  • Theory of Computation introduces the concept of a Nondeterministic Turing Machine.
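The "same seed, same sequence" property of PRNGs described above is easy to observe in Python. The sketch also draws bytes from the `secrets` module for contrast. Note the assumption: `secrets` reads from the operating system's cryptographic generator, which is fed by hardware and environmental entropy; it is not a raw TRNG, but it cannot be replayed by reusing a seed.

```python
import random
import secrets


def pseudo_random_sequence(seed, count=5):
    """Return `count` numbers from a PRNG initialised with `seed`."""
    rng = random.Random(seed)
    return [rng.randrange(100) for _ in range(count)]


first = pseudo_random_sequence(1234)
second = pseudo_random_sequence(1234)
print(first)
print(first == second)  # same seed, same sequence

# No seed to reuse here: the bytes come from the operating system.
token = secrets.token_bytes(16)
print(token.hex())
```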

Cryptography

  • AES encryption typically uses:

    • Plaintext

    • Key

    • IV (Initialization Vector)

  • Security guidelines say:

    • Key should be random.

    • IV should be random (or at least unpredictable, depending on the mode).

Question: If Randomness Is Pseudo-Random, Where Do Security Guarantees Come From?

Question: Is There an Acceptable Level of Randomness?

  • No. Cryptography is mathematics. Mathematical guarantees require precise definitions.

  • "Looks random" is not a guarantee.

  • A common claim: "It is secure because attackers don't have enough computing power."

    • No. Security is not simply:

      • "Current computers are too slow."
    • Cryptography aims for stronger guarantees than "nobody can break it today."

Now the question becomes: What property makes something cryptographically secure?

  • A naive answer would be: if the output contains roughly 50% zeros and 50% ones, it is random.

    • But a sequence can have 50% zeros and 50% ones and still be predictable.

    • Therefore, statistical balance alone does not imply security.

  • Random number generators can also be biased.

  • Example mentioned:

  • Mersenne Twister is:

    • Excellent for simulations.

    • Excellent for Monte Carlo methods.

    • Not suitable for cryptographic security.

  • Reason:

    • Future outputs can potentially be predicted if enough outputs are observed.
  • Suppose we have already seen:

    P(1), P(2), P(3), ..., P(n)
    

    Question:

    Can we predict P(n+1)?

    A cryptographically secure generator should ensure:

    Even after seeing all previous outputs, predicting the next bit should be no better than random guessing.

  • A cryptographically secure RNG should provide:

  • The key idea is:

    Security guarantees come from unpredictability, not merely from statistical randomness.

  • But,

    • "Seekhne ke liye sawaal karna padta hai" ("To learn, you have to ask questions") - Mishra Ji
  • And we didn't ask any questions!
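The Mersenne Twister point made earlier in this section can be demonstrated directly. Python's `random` module uses MT19937, whose 32-bit outputs are an invertible scrambling ("tempering") of its internal state words. The sketch below is a well-known state-recovery technique, not something shown in the talk: after observing 624 consecutive outputs it rebuilds the state and predicts what comes next. Function names are illustrative.

```python
import random


def undo_right_shift_xor(y, shift):
    """Invert y = x ^ (x >> shift) for 32-bit x."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ (x >> shift)
    return x


def undo_left_shift_xor(y, shift, mask):
    """Invert y = x ^ ((x << shift) & mask) for 32-bit x."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ ((x << shift) & mask)
    return x


def untemper(y):
    """Recover the internal state word behind one MT19937 output."""
    y = undo_right_shift_xor(y, 18)
    y = undo_left_shift_xor(y, 15, 0xEFC60000)
    y = undo_left_shift_xor(y, 7, 0x9D2C5680)
    y = undo_right_shift_xor(y, 11)
    return y


def clone_generator(observed):
    """Build a generator that continues the sequence after 624 outputs."""
    assert len(observed) == 624
    state = tuple(untemper(value) for value in observed) + (624,)
    clone = random.Random()
    clone.setstate((3, state, None))
    return clone


victim = random.Random()  # seeded from the OS; the seed is unknown to us
observed = [victim.getrandbits(32) for _ in range(624)]
clone = clone_generator(observed)

predicted = [clone.getrandbits(32) for _ in range(5)]
actual = [victim.getrandbits(32) for _ in range(5)]
print(predicted == actual)
```

The outputs pass statistical tests, yet every future value is predictable from past ones. This is exactly the gap between "looks random" and "cryptographically secure" that the section describes.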

Database Systems

  • Now let's come to Database Systems.

  • Questions:

    • If a process terminates unexpectedly, why isn't all data lost?

    • If a server suddenly shuts down, why is the database still usable after restart?

    • We assume data survives crashes, but what mechanism actually guarantees that?

  • Databases claim:

    • Data integrity.

    • Consistent reads.

    • Reliable writes.

  • Question:

    • Where are these guarantees coming from?

    • Database?

    • Operating System?

    • Filesystem?

    • Storage device?

  • A database service crashing does not automatically mean data loss. Why?

  • What recovery mechanisms make this possible?

  • Why isn't the database corrupted every time a process crashes?

  • For an operation:

UPDATE ...
INSERT ...
  • Possible outcomes:

    • Commit

    • Rollback

  • Nothing in between.

  • Question:

    • Why can't partial updates exist?

    • How does the database guarantee all-or-nothing behavior?

  • Database should always move:

Safe State
↓
Transaction
↓
Safe State
  • Not
Safe State
↓
Half Complete Transaction
↓
Corrupted State
  • Question:

    • What makes a state "safe"?

    • How does the database ensure it never leaves the system in an inconsistent state?

  • Textbooks say:

    Transactions are atomic.

  • Question:

    • What does atomic actually mean?

    • How is atomicity implemented?

    • What mechanisms enforce it?

  • Atomicity means:

All operations succeed
OR
All operations fail
  • No intermediate state should be visible.
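A small, self-contained way to watch all-or-nothing behaviour is SQLite's in-memory database. In the sketch below the second UPDATE of an oversized transfer violates a CHECK constraint, and the first UPDATE is rolled back with it. The table and the `transfer` helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts; either both updates apply or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


failed = transfer(conn, "alice", "bob", 500)  # alice cannot cover this
after_failure = balances(conn)
succeeded = transfer(conn, "alice", "bob", 30)
after_success = balances(conn)
print(failed, after_failure)
print(succeeded, after_success)
```

The sketch shows the guarantee from the outside. How the engine implements it (journals, write-ahead logs) is the next question worth asking.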

  • Database says:

I provide atomic operations.

  • Question:

How does the database guarantee atomic writes?

  • Don't stop at the definition. Ask about the implementation.

  • When an application writes data:

Application
↓
Database
↓
Operating System
↓
Filesystem
↓
Storage Device (SSD/HDD)
  • Many layers exist between the query and the actual disk.

  • A possible sequence can be

Database writes data
↓
OS accepts write
↓
Database receives success
↓
Transaction marked COMMIT
  • But the data may still exist only in cache.

  • Is it on the SSD? Not yet. Is it in permanent storage? Not yet.

  • Question:

    • What does "success" actually mean?

    • What does "committed" actually mean?

  • Userspace receives confirmation.

  • OS may acknowledge the write.

  • Data may still be in:

    • RAM cache

    • Filesystem cache

    • Controller cache

  • Question:

    • What happens if power is lost before the cache is flushed?
  • Common assumption: Write Success = Data Safely Stored

  • But is this always true?

  • Always Challenge the Assumptions!

  • “Engineer bhau ko fursat nahi hai, Heckur bhau ko assumptions pe focus karna padta hai” ("The engineer bro has no time to spare; the hacker bro has to focus on the assumptions") - Mishra Ji
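The gap between "write succeeded" and "data is on the device" is visible even in a few lines of application code. The sketch below, assuming a POSIX system, shows the usual pattern for a write that should survive a crash: flush the userspace buffer, ask the OS to sync the file, rename atomically, then sync the directory. The helper name `durable_write` is hypothetical, and even `fsync` depends on the drive honouring the flush request.

```python
import os
import tempfile


def durable_write(path, data):
    """Replace `path` with `data` so that a crash leaves old or new content."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as handle:
        handle.write(data)          # data sits in Python's userspace buffer
        handle.flush()              # now in the OS page cache, still in RAM
        os.fsync(handle.fileno())   # ask the OS to push it to the device
    os.replace(tmp_path, path)      # atomic rename: no half-written file

    # Sync the directory so the rename itself is persisted (POSIX only).
    dir_fd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, "state.json")
    durable_write(target, b'{"committed": true}')
    with open(target, "rb") as handle:
        stored = handle.read()

print(stored)
```

Skipping the `fsync` calls would still return without error, which is precisely the "Write Success = Data Safely Stored" assumption being challenged above.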

Pegasus / FORCEDENTRY: Challenging Assumptions

  • Let's look at one real example where hackers challenged the assumptions of the engineers.

  • A message contains:

    • An image

    • A GIF

    • A PDF

  • The parser decodes the content.

  • The renderer displays the content.

  • End of story.

  • Assumption:

Image = Data
PDF = Document
Decoder = Renderer
  • Instead of asking:

What does this image contain?

  • Ask:

What is the parser actually doing?

  • The initial observation was:

    • The attack arrived through iMessage.

    • The attachment appeared to be a GIF.

    • No user interaction was required.

    • Victim didn't need to click anything.

  • Everyone assumed that something inside the GIF itself was malicious. But then came Project Zero!

  • They published a blog post telling everyone that the image format was Turing-complete!

  • The Reality

    • The file looked like a GIF.

    • It was actually carrying a malicious PDF payload.

    • The apparent file type was not the important part.

  • Lesson: File Extension ≠ Actual Behavior

  • The exploit abused JBIG2, an image compression format used inside PDFs.

  • JBIG2 allows defining symbols and performing operations on them during decoding.

  • NSO discovered that these operations were expressive enough to build:

    • Logic gates

    • Comparisons

    • Arithmetic operations

    • Memory access primitives

  • Project Zero described it as:

Building a computer inside the image decoder.

  • The important distinction:

    • Engineer's Assumption: JBIG2 = Image Compression Format

    • NSO's Observation: JBIG2 = Instruction Set

  • Once you can build:

    • AND

    • OR

    • NOT

    • Conditional behavior

    • Memory manipulation

  • you are approaching the requirements for universal computation.

  • Project Zero demonstrated that the exploit implemented a virtual machine using JBIG2 segments.

  • The exploit performed arbitrary computation during image decoding.

  • Researchers commonly describe the JBIG2 environment as effectively Turing-complete or at least powerful enough for arbitrary computation.

  • Reference: https://probability.ca/jeff/ftpdir/decipherart.pdf

  • The key insight isn't whether someone formally proved Turing completeness.

  • The key insight is:

An image format that engineers thought was merely for compression was powerful enough to execute complex programs.
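Why are a handful of logical operators enough to worry about? The sketch below starts from only AND, OR and NOT on single bits, derives XOR, builds a full adder, and chains adders into 8-bit addition. It is a generic illustration of gates composing into arithmetic, written for this article; it does not reproduce the actual JBIG2 exploit.

```python
def AND(a, b):
    return a & b


def OR(a, b):
    return a | b


def NOT(a):
    return 1 - a  # valid for single bits 0 and 1


def XOR(a, b):
    # Derived purely from the three primitive gates.
    return OR(AND(a, NOT(b)), AND(NOT(a), b))


def full_adder(a, b, carry_in):
    """Add three bits and return (sum_bit, carry_out)."""
    partial = XOR(a, b)
    sum_bit = XOR(partial, carry_in)
    carry_out = OR(AND(a, b), AND(carry_in, partial))
    return sum_bit, carry_out


def add(x, y, width=8):
    """Ripple-carry addition of two unsigned integers, built only from gates."""
    carry = 0
    result = 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result


print(add(13, 29))
print(add(200, 55))
print(add(255, 1))  # wraps around at 8 bits
```

With addition and comparison in hand, plus a way to read and write memory, registers and conditional logic follow. That is the path from "image decoder" to "small computer" described in this section.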

Illusion of Learning: Forgotten Basics

  • But people will not spend hours strengthening their basics by asking questions.

  • Bros will spend time grinding HTB (Hack The Box), asking for writeups to solve machines.

    • In public: "Bro, I solved the Insane machine!" Behind the scenes: "Bro, please give me the writeup. I need the writeup to solve this."

    • They are just memorizing writeups, not understanding anything.

  • It is the same case for certifications.

    • In public: "Bro, I got a shiny new cert!" Behind the scenes: "Bro, please give me the dumps. I need the dumps to pass this certification."

    • They are just memorizing dumps to pass the certification, not understanding its material.

  • That's why: focus on the basics. Ask questions. And lastly,

  • "CS ki ghutti bnake pee lo" ("Brew CS into a tonic and drink it down") - Mishra Ji