Cloud Security Interview Prep

In cloud computing, security responsibilities are split between the cloud provider and the customer. The provider always owns the underlying infrastructure: physical security, hardware, hypervisor, and the global network fabric. The customer is responsible for what they put on top of that infrastructure.

The split shifts based on the service model:

IaaS (Infrastructure as a Service — EC2, Azure VMs): Most customer responsibility. You own OS patching, runtime configuration, application security, network rules, data, and identity. The provider owns physical infrastructure and virtualization.

PaaS (Platform as a Service — AWS Lambda, Azure App Service): Provider handles OS and runtime; you own applications, data, and IAM.

SaaS (Software as a Service — Salesforce, Google Workspace): Provider handles almost everything. You own access management, user configuration, and how you use and share data within the platform.
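One way to keep the split straight is to write it down as a lookup table. The sketch below encodes the three models as described above; the layer names and the `owner` helper are illustrative assumptions, and real provider matrices are more granular.

```python
# Layers named in the text, ordered roughly from infrastructure up to data.
# These labels are hypothetical shorthand, not any provider's official terms.
LAYERS = [
    "physical", "virtualization", "os", "runtime",
    "network_rules", "application", "identity", "data",
]

# Customer-owned layers per service model, following the descriptions above.
CUSTOMER_OWNS = {
    "iaas": {"os", "runtime", "network_rules", "application", "identity", "data"},
    "paas": {"application", "identity", "data"},
    "saas": {"identity", "data"},
}


def owner(model: str, layer: str) -> str:
    """Return 'customer' or 'provider' for a layer under a service model."""
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer}")
    return "customer" if layer in CUSTOMER_OWNS[model.lower()] else "provider"
```

Reading the table column by column (for example, `owner("IaaS", "os")` against `owner("PaaS", "os")`) is a quick way to rehearse how responsibility moves toward the provider.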

Misunderstanding this model — assuming the provider handles security you're actually responsible for — is the root cause of many cloud breaches.

What they want to hear: How customer responsibility shrinks from IaaS → PaaS → SaaS, and that misunderstanding the split is a real breach cause. Cloud security interviews often start here.

Least privilege means granting every user, service, and workload only the minimum permissions needed to perform their function — nothing more.

In AWS: Use IAM roles (not long-term access keys or root), avoid wildcard permissions (* in actions or resources), use Service Control Policies (SCPs) in AWS Organizations to set permission guardrails at the account level, and run IAM Access Analyzer to identify over-permissive policies.

In Azure: Use RBAC with built-in roles over custom roles where possible, apply Conditional Access policies for human identities, and use Privileged Identity Management (PIM) for just-in-time admin access — admins request elevated access for a time-limited window rather than holding it permanently.

The operational importance: when a workload's credentials are compromised (which happens regularly through code repos, environment variables, and SSRF vulnerabilities), least privilege limits the blast radius to only what that workload could access legitimately.
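The wildcard check mentioned for AWS can be sketched as a small function over an IAM policy document. This is a minimal illustration, assuming the standard policy JSON shape (`Statement`, `Effect`, `Action`, `Resource`); `find_wildcard_statements` is a hypothetical helper, not how IAM Access Analyzer works, and it ignores `NotAction` and conditions.

```python
def _as_list(value):
    """IAM accepts either a single string or a list for Action and Resource."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_wildcard_statements(policy: dict) -> list[dict]:
    """Flag Allow statements that use '*' or 'service:*' actions, or '*' resources."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear unwrapped
        statements = [statements]

    flagged = []
    for index, stmt in enumerate(statements):
        if stmt.get("Effect") != "Allow":
            continue
        broad_actions = [
            a for a in _as_list(stmt.get("Action")) if a == "*" or a.endswith(":*")
        ]
        broad_resources = [r for r in _as_list(stmt.get("Resource")) if r == "*"]
        if broad_actions or broad_resources:
            flagged.append(
                {"index": index, "actions": broad_actions, "resources": broad_resources}
            )
    return flagged
```

Treat the output as a review prompt, not a verdict: some actions only support `"Resource": "*"`, so a flagged statement is a place to look, not automatically a finding.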

What they want to hear: Specific controls in at least one cloud provider (IAM roles, SCPs, PIM), and the blast radius argument for why it matters operationally, not just theoretically.

A cloud misconfiguration is an incorrectly configured cloud resource that creates a security exposure — a publicly accessible S3 bucket containing sensitive data, a security group allowing all inbound traffic (0.0.0.0/0 on port 22), CloudTrail logging disabled, hardcoded credentials committed to a repository, or storage without encryption at rest.
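To make the security group example concrete, here is a minimal sketch that checks one inbound rule for world-open SSH. It assumes the rule is a dict shaped like an entry in the `IpPermissions` list returned by the EC2 DescribeSecurityGroups API; the function name is hypothetical.

```python
def exposes_port_to_world(ip_permission: dict, port: int = 22) -> bool:
    """Return True if the rule allows `port` from 0.0.0.0/0 or ::/0."""
    protocol = ip_permission.get("IpProtocol")
    if protocol == "-1":
        # "-1" means all protocols and all ports.
        covers_port = True
    elif protocol == "tcp":
        from_port = ip_permission.get("FromPort")
        to_port = ip_permission.get("ToPort")
        covers_port = (
            from_port is not None
            and to_port is not None
            and from_port <= port <= to_port
        )
    else:
        covers_port = False

    open_v4 = any(
        r.get("CidrIp") == "0.0.0.0/0" for r in ip_permission.get("IpRanges", [])
    )
    open_v6 = any(
        r.get("CidrIpv6") == "::/0" for r in ip_permission.get("Ipv6Ranges", [])
    )
    return covers_port and (open_v4 or open_v6)
```

Note that a port range such as 0-65535 also matches, which is the case a check for an exact "port 22" rule would miss.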

It happens frequently because:

Developer IAM access without security training: Engineers who provision infrastructure often have broad permissions but no security background. They optimize for what works, not what's secure.

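Insecure defaults: Default settings are not always secure, and the secure ones are easy to loosen. S3 buckets have always been private by default, but only since April 2023 do new buckets also have Block Public Access enabled by default; before that, a single ACL or policy change could expose one. New security groups block inbound traffic but allow all outbound traffic, and opening SSH to 0.0.0.0/0 is a one-line change that tends to stay in place. Security often requires opting in.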

Speed over security: Cloud moves fast. Teams deploying multiple times a day skip security reviews. One misconfiguration in a Terraform module gets replicated across 50 environments.

Scale: A single misconfigured IaC template can instantiate across hundreds of accounts or regions before anyone notices.

What they want to hear: Multiple root causes beyond "people make mistakes." The IaC replication point (one template, hundreds of resources) is the scale factor most candidates miss.

CSPM (Cloud Security Posture Management) is a category of tooling that continuously scans your cloud environment (across accounts, subscriptions, projects, and regions) and compares its configuration against security best practices and compliance frameworks.

It identifies publicly exposed storage, over-permissive IAM policies, disabled logging, unencrypted storage volumes, and security groups with wide-open rules, and it maps each finding to controls in compliance frameworks like CIS Benchmarks, NIST 800-53, PCI DSS, and SOC 2.

Most CSPM tools integrate with ticketing systems (Jira, ServiceNow) to route findings to remediation owners, track fix timelines, and report on SLA compliance.
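Conceptually, the scan-and-route loop described above is a set of rules evaluated against a resource inventory. The sketch below is a toy model under stated assumptions: the resource dict shape, rule IDs, and the `EXAMPLE-CONTROL-n` labels are all hypothetical placeholders, not real framework control numbers or any vendor's rule format.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    rule_id: str
    control: str  # placeholder label standing in for a framework control
    is_violation: Callable[[dict], bool]


@dataclass(frozen=True)
class Finding:
    rule_id: str
    resource_id: str
    owner: str  # who the ticket would be routed to
    control: str


RULES = [
    Rule("storage-public", "EXAMPLE-CONTROL-1",
         lambda r: r.get("type") == "bucket" and r.get("public", False)),
    Rule("volume-unencrypted", "EXAMPLE-CONTROL-2",
         lambda r: r.get("type") == "volume" and not r.get("encrypted", False)),
    Rule("logging-disabled", "EXAMPLE-CONTROL-3",
         lambda r: r.get("type") == "account" and not r.get("audit_logging", False)),
]


def evaluate(resources: list[dict], rules: list[Rule] = RULES) -> list[Finding]:
    """Check every resource against every rule and emit routable findings."""
    findings = []
    for resource in resources:
        for rule in rules:
            if rule.is_violation(resource):
                findings.append(
                    Finding(
                        rule.rule_id,
                        resource["id"],
                        resource.get("owner", "unassigned"),
                        rule.control,
                    )
                )
    return findings
```

The `owner` field is what makes a finding actionable: without it, the result is a report; with it, the result is a ticket.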

Examples: Wiz, Orca Security, AWS Security Hub + Config, Microsoft Defender for Cloud (formerly Azure Security Center), Prisma Cloud (formerly RedLock). Modern platforms (called CNAPP — Cloud-Native Application Protection Platform) combine CSPM with CWPP and other capabilities.

What they want to hear: What it scans for, how it integrates with compliance frameworks, and at least 2 product names. Mentioning CNAPP as the evolution of standalone CSPM signals current market awareness.

CSPM (Cloud Security Posture Management): Monitors cloud infrastructure configuration for misconfigurations and compliance gaps. Answers the question: "Is my S3 bucket public? Is MFA enforced? Is logging enabled?" It operates at the control plane level.

CWPP (Cloud Workload Protection Platform): Protects running workloads — VMs, containers, serverless functions — from runtime threats. Answers: "Is malware running in my container? Is there anomalous process activity in this EC2 instance? Are my container images vulnerable?" It operates at the data plane (runtime) level.

CASB (Cloud Access Security Broker): Sits between users and SaaS applications to enforce access policies, detect data exfiltration through cloud apps, and discover shadow IT. Answers: "Is sensitive data leaving via Dropbox? Which SaaS apps is my organization using without IT approval?" Operates at the application access level.

Modern CNAPP (Cloud-Native Application Protection Platform) products combine CSPM + CWPP + more into a unified platform — Wiz and Prisma Cloud are primary examples.

What they want to hear: Clear distinctions with a one-sentence purpose for each, and CNAPP as the modern convergence. This is a standard cloud security vocabulary question — knowing all three and their relationship is baseline for mid-level roles.

Move fast. Public repository credential exposures are scraped by automated bots within seconds to minutes. The clock starts the moment the commit is pushed — not the moment you're notified.

Immediately rotate or revoke the exposed credentials. Don't wait to investigate first: invalidate the key, then investigate. Revocation takes about 30 seconds, and every minute spent investigating first is a minute the key can still be used.

Check CloudTrail for unauthorized use from the credential's creation time through revocation. Look for API calls from unfamiliar source IPs, unusual regions, or unusual services (DescribeInstances followed by CreateUser is a classic recon-then-privilege-escalation pattern).

If unauthorized use is found, scope the impact and treat as a cloud account compromise incident.

Remediate the root cause with the developer: Remove credentials from code, use environment variables or a secrets manager (AWS Secrets Manager, HashiCorp Vault), and add a pre-commit hook or CI check (git-secrets, truffleHog) to prevent credential commits in the future. Frame it as a system improvement, not a blame event.
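The prevention step can be illustrated with a minimal secret scan of the kind a pre-commit hook or CI job might run. This is a sketch, not how git-secrets or truffleHog are implemented; it only looks for the well-known `AKIA` prefix of long-term AWS access key IDs, and `scan_text` is a hypothetical helper.

```python
import re

# Long-term IAM access key IDs start with "AKIA" followed by 16 uppercase
# letters or digits. Real scanners cover many more credential formats.
ACCESS_KEY_ID = re.compile(r"\b(AKIA[0-9A-Z]{16})\b")


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line_number, redacted_key) for each access key ID found."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in ACCESS_KEY_ID.finditer(line):
            key = match.group(1)
            # Redact the middle so the scanner's own output does not leak the key.
            hits.append((line_number, key[:4] + "..." + key[-4:]))
    return hits
```

A hook built on this would exit non-zero when `scan_text` returns any hits, which blocks the commit before the key ever reaches the remote.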

What they want to hear: Revoke first, investigate second — the ordering matters. CloudTrail as the investigation source. And a remediation that prevents recurrence, not just cleans up this instance.

Meet them where they work. Join their Slack, attend their standups, learn their deployment pipeline and tooling. Security that lives outside the engineering workflow is security that gets skipped.

Make secure defaults easy. Provide pre-approved Terraform modules with security controls baked in, security linting in the CI pipeline (checkov, tfsec), and self-service documentation that answers "what's the approved way to do X?" rather than "please submit a request and we'll review it."
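As an illustration of security linting in the pipeline, the sketch below inspects the JSON form of a Terraform plan (the output of `terraform show -json`) for security groups with ingress open to the internet. It is a simplified stand-in for what tools like checkov or tfsec do, not their implementation; it assumes the AWS provider's `aws_security_group` resource with inline `ingress` blocks, and the function name is hypothetical.

```python
def open_ingress_violations(plan: dict) -> list[str]:
    """List planned aws_security_group ingress rules open to 0.0.0.0/0."""
    violations = []
    for change in plan.get("resource_changes", []):
        if change.get("type") != "aws_security_group":
            continue
        # "after" is null when the resource is being destroyed.
        after = (change.get("change") or {}).get("after") or {}
        for rule in after.get("ingress") or []:
            if "0.0.0.0/0" in (rule.get("cidr_blocks") or []):
                violations.append(
                    f'{change["address"]}: ingress '
                    f'{rule.get("from_port")}-{rule.get("to_port")} open to 0.0.0.0/0'
                )
    return violations
```

Running a check like this on the plan, before apply, is what stops one bad module from being replicated across environments. Rules defined as separate rule resources are not covered by this sketch.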

Reduce friction, don't just add it. For every thing you tell engineers they can't do, offer an approved alternative. "That IAM policy is too permissive — here's the policy I'd use to accomplish the same thing" is more useful than "that's a finding."

Track outcomes, not just findings. Measure time-to-remediation (improving), not just finding count (always high). Celebrate teams that fix things fast — make security success visible. Engineers respond to recognition, not just compliance scorecards.

What they want to hear: That security has to earn trust through usefulness, not mandate it through authority. The "approved alternative" framing and pipeline integration are the most concrete signals of experience working in cloud-native environments.

Translate risk to financial terms. Not "we need CSPM" but "we have X accounts and Y resources with no centralized misconfiguration detection. Public S3 misconfigurations were the root cause in Z% of cloud breaches last year per [IBM/Verizon DBIR/Gartner]. A single misconfiguration event at our scale could cost [notification + fine + remediation + customer churn estimate]."

Show the gap before and after. Document your current coverage — what attack vectors do you have no visibility into? What compliance requirements are you manually checking? The tooling closes specific named gaps, not abstract risk.

Reference industry context. IBM Cost of a Data Breach Report (annual, sector-specific), Gartner cloud security market data, and peer company incidents are all legitimate inputs. If a competitor was breached via a misconfiguration you currently can't detect, that's directly relevant.

Close with a specific ROI frame: "This tool costs $X/year and replaces Y hours of manual audit effort per quarter, while also providing coverage we currently have no substitute for."
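The ROI sentence above reduces to simple arithmetic. The sketch below works it through with entirely hypothetical inputs (tool cost, hours saved, and hourly rate are placeholders to replace with your own figures), and it deliberately counts only labor savings, leaving out the harder-to-quantify value of new coverage.

```python
def annual_roi(
    tool_cost_per_year: float,
    hours_saved_per_quarter: float,
    loaded_hourly_rate: float,
) -> dict:
    """Compare a tool's yearly cost with the manual audit labor it replaces."""
    labor_savings = hours_saved_per_quarter * 4 * loaded_hourly_rate
    return {
        "labor_savings": labor_savings,
        "net": labor_savings - tool_cost_per_year,
        "savings_to_cost_ratio": labor_savings / tool_cost_per_year,
    }


# Hypothetical example: a $60,000/year tool replacing 200 audit hours per
# quarter at a $100/hour loaded rate.
example = annual_roi(60_000, 200, 100)
```

With these placeholder numbers the labor savings are $80,000 per year, so the tool nets $20,000 before any credit for the coverage gap it closes.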

What they want to hear: Specific financial framing, a coverage gap narrative, and an ROI structure. Candidates who can connect tooling to business risk (not just technical coverage) are far more effective in cloud security roles.

Make it private immediately. Every second of continued exposure is additional risk. Enable S3 Block Public Access on the bucket, remove any public grants from the bucket policy and ACLs, and confirm the change took effect (for example, with an unauthenticated request from outside your network).

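Determine the exposure window: Check S3 server access logs and CloudTrail to find when the public access was enabled, who enabled it (API call and source identity), and what was accessed by external IPs during that window (GetObject requests from non-internal IPs). Object-level calls such as GetObject appear in CloudTrail only if S3 data events were enabled for the bucket; if they were not, server access logs are the fallback.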
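The "what was accessed from outside" question can be sketched as a filter over CloudTrail records. This is a minimal example assuming records with the standard `eventName`, `eventTime`, and `sourceIPAddress` fields, already loaded as dicts; `external_reads` and the CIDR list are hypothetical, and the window bounds must be timezone-aware datetimes.

```python
from datetime import datetime
from ipaddress import ip_address, ip_network


def external_reads(
    events: list[dict],
    internal_cidrs: list[str],
    window_start: datetime,
    window_end: datetime,
) -> list[dict]:
    """Return GetObject events inside the window whose source IP is not internal."""
    networks = [ip_network(cidr) for cidr in internal_cidrs]
    hits = []
    for event in events:
        if event.get("eventName") != "GetObject":
            continue
        # CloudTrail timestamps are UTC with a trailing "Z".
        when = datetime.fromisoformat(event["eventTime"].replace("Z", "+00:00"))
        if not (window_start <= when <= window_end):
            continue
        try:
            source = ip_address(event.get("sourceIPAddress", ""))
        except ValueError:
            # The field can hold an AWS service name instead of an IP; skip those.
            continue
        if not any(source in network for network in networks):
            hits.append(event)
    return hits
```

The object keys in the matching events are the starting point for the data impact scoping that follows.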

Scope the data impact: What data was in the bucket and how sensitive is it? PII triggers different obligations than internal documents. Determine record count, data categories, and whether encryption was in use. Encryption at rest does not help much here: with SSE-S3, S3 decrypts objects transparently for any request it authorizes, so a publicly readable object is served in plaintext over HTTP.

Notify legal immediately. Most jurisdictions have mandatory breach notification timelines (for example, 72 hours to notify the supervisory authority under GDPR after becoming aware of a breach; US state deadlines vary, commonly in the range of 30 to 60 days). Depending on the law, the clock can start from when you "knew or should have known" of the breach, so it may already be running. Do not wait for the full technical investigation before engaging legal.

Conduct a broader audit of all bucket policies across all accounts — this is rarely a one-bucket problem.
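For the broader audit, a first-pass filter over bucket policies might look like the sketch below. It flags Allow statements with a wildcard principal and no condition; this is a heuristic under stated assumptions, not AWS's own public-access evaluation, and `public_statements` is a hypothetical helper. Statements with conditions are skipped here and would still need manual review.

```python
def _as_list(value):
    """Policy fields may be a single string or a list of strings."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def public_statements(policy: dict) -> list[dict]:
    """Return Allow statements that grant access to everyone unconditionally."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]

    flagged = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal")
        is_wildcard = principal == "*" or (
            isinstance(principal, dict) and "*" in _as_list(principal.get("AWS"))
        )
        if is_wildcard and not stmt.get("Condition"):
            flagged.append(stmt)
    return flagged
```

Run across every bucket in every account, a check like this turns "is this the only one?" into a list you can actually work through.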

What they want to hear: Remediate first, then investigate. Legal notification as an early step (not an afterthought). The broader audit — this is almost never isolated to one bucket — shows operational experience.

This pattern is consistent with cloud enumeration — an attacker (or overpermissioned user) mapping your environment before taking action. Describe/List/Get calls are read-only reconnaissance, but they precede most cloud compromise patterns.

Identify the IAM identity: Which access key or role made these calls? Is it attached to a known user or service? Is the key from an active, expected deployment or something that should have been rotated?

Characterize the source IP: Is it a known corporate IP range, a cloud provider IP (legitimate cross-account call), or an external IP? Look it up in threat intel — is it a Tor exit node, a known attacker infrastructure IP, a commercial VPN?

Review the full session: Were any write operations (CreateUser, AttachPolicy, RunInstances) made? Read-only enumeration that escalates to write operations is the critical inflection point. If write operations occurred, treat this as an active compromise.

If suspicious: Rotate or revoke the access key immediately (not "after we investigate further"), then investigate from CloudTrail what was accessed during the entire window the key was potentially compromised.
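The triage steps above can be sketched as a single pass over one key's CloudTrail records. This assumes records loaded as dicts with the standard `eventName`, `eventTime`, `sourceIPAddress`, and `userIdentity.accessKeyId` fields; `triage_session` is a hypothetical helper, and classifying reads by name prefix is a simplification.

```python
# Name prefixes treated as read-only reconnaissance, as in the text.
READ_PREFIXES = ("Describe", "List", "Get")


def triage_session(events: list[dict], access_key_id: str) -> dict:
    """Summarize one access key's activity and flag writes that follow recon."""
    session = sorted(
        (
            e for e in events
            if e.get("userIdentity", {}).get("accessKeyId") == access_key_id
        ),
        # UTC timestamps in the same ISO format sort correctly as strings.
        key=lambda e: e["eventTime"],
    )

    seen_read = False
    read_calls = 0
    writes_after_recon = []
    for event in session:
        name = event["eventName"]
        if name.startswith(READ_PREFIXES):
            seen_read = True
            read_calls += 1
        elif seen_read:
            writes_after_recon.append(name)

    return {
        "read_calls": read_calls,
        "writes_after_recon": writes_after_recon,
        "source_ips": sorted({e.get("sourceIPAddress", "unknown") for e in session}),
        "treat_as_compromise": bool(writes_after_recon),
    }
```

The `source_ips` list feeds the threat intel lookup, and `treat_as_compromise` marks the inflection point where enumeration turned into write activity.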

What they want to hear: Identify the IAM identity, check the source IP in threat intel, look for escalation to write operations, and rotate the key as a decisive action rather than a last resort. This is cloud incident triage, not just alert review.

Tips for this role

Know at least one cloud provider deeply. AWS is most common in job postings; Azure is common in enterprise environments. Pick one and be able to describe specific services and security controls in detail (IAM, VPC, CloudTrail, Security Groups, KMS). "I've used AWS" isn't enough; "I've configured SCPs, managed GuardDuty findings, and built a CSPM workflow with Security Hub" is.

Understand IaC security. Most cloud environments are provisioned with Terraform, CDK, or CloudFormation. Being able to discuss IaC security review (checkov, tfsec, OPA policies) shows you can actually operate in a modern cloud engineering environment, not just on the security team's side of it.

Lead with "how do we enable the business." Cloud security roles that see themselves as gatekeepers get bypassed. The most effective cloud security engineers are the ones who make secure patterns the path of least resistance for developers. Frame your experience in terms of how you reduced friction while improving security, not just how you blocked things.
