
How to Scrape Instagram with Python and Astro residential proxies: A Step-by-Step Guide

02 July 2026

Guides & Setups

Parsing Instagram remains one of the most in-demand tasks in web scraping, competitive analysis, and marketing research. Companies use parsers to monitor public profiles, track brand activity, and automate the collection of open data.

As request volume grows, a key challenge emerges – keeping the parser stable. Sending all requests from a single IP address can eventually lead to rate limits, slower response times, and blocks. Residential proxies solve this problem: requests are distributed across a pool of real IP addresses, keeping the load on any single address minimal.

In the previous article, we covered why proxies are essential for Instagram scraping. In this case study, we'll show how to connect Astro residential proxies to a Python parser and set up collection of public Instagram data. You'll learn how to configure an HTTP client, connect a proxy, correctly handle responses, and prepare the project for scaling.

After reading this article, you'll be able to:

connect Astro proxies to Python;
configure an HTTP client using `httpx`;
request public Instagram pages and verify response correctness;
handle service-side restrictions (redirects, rate limits);
scale the parser without changing the application's architecture.

Important to Know Before You Start

Instagram actively restricts automated access to its pages. In practice, this means:

Without authorization, the service may redirect the request to the login page. The parser must be able to recognize this situation rather than treat it as a successful response.
A significant portion of the content is loaded via JavaScript. A plain HTTP request can reliably extract only what is in the page's meta tags: the profile name, the description, and the follower and post counts in `og:description`. Deeper data collection requires other tools, such as browser automation.
Don't send requests too frequently. Pauses between requests and handling of the 429 status code are a mandatory part of any working parser.
Before launching the project, make sure your data collection complies with the service's terms of use and the laws of your jurisdiction. Work only with publicly available information.

Why Astro Is a Good Fit for Parsing Instagram

When scaling Instagram parsing, it's not just the code that matters – the quality of the proxy infrastructure matters too. If the IP pool is small or rotation is limited, even a well-written parser will eventually run into rate limits and reduced stability.

Astro provides tools that help avoid these problems:

50 million IP addresses across 150 countries let you distribute requests across a large number of real residential addresses and scale data collection without changing the application's architecture.
Multiple IP rotation modes – a new address on every connection, rotation on a timer (starting from 1 minute), or forced rotation via API. This lets you pick the optimal strategy for a specific parsing scenario.
Flexible geo-targeting – choose a country, city, or ISP, with the option to use up to 10 countries on a single port or a random country while excluding unwanted regions.
HTTP(S) and SOCKS5 support, plus up to 250 concurrent TCP connections per port, which is convenient for parallel processing of large numbers of profiles.
99.9% uptime helps ensure stable operation for long-running parsers and automated tasks.

A free $3 trial is available, giving you enough credit to test the service and verify that the proxy works with your code before purchasing.

That's why Astro works well both for small Python scripts and for large-scale systems that collect public Instagram data on an ongoing basis.

Connection Specifics: Domain Names Only

Astro proxies handle requests by domain name – direct access to resources by IP address is not available. This doesn't limit our parser in any way: all requests are formed using the domain `www.instagram.com`, and DNS resolution happens on the proxy server's side.

The one thing to keep in mind: don't pre-resolve the domain to an IP address in your code (for example, via `socket.gethostbyname`) and then make a request to that IP – the proxy won't allow such a request through. Always pass a URL with the domain name to the HTTP client.
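A small guard can catch IP-based URLs before they reach the proxy. The sketch below is an illustration only: `has_domain_host` is a hypothetical helper name, and the IP address is a documentation-range placeholder, not a real Instagram address.

```python
import ipaddress
from urllib.parse import urlsplit


def has_domain_host(url: str) -> bool:
    """Return True if the URL's host is a domain name, False if it is an IP literal."""
    host = urlsplit(url).hostname
    if not host:
        return False  # no host at all, nothing to send through the proxy
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return True  # not parseable as IPv4/IPv6, so treat it as a domain name
    return False


print(has_domain_host("https://www.instagram.com/nasa/"))  # True
print(has_domain_host("https://203.0.113.10/nasa/"))       # False
```

Calling such a check before `client.get(...)` turns a confusing proxy-side rejection into an explicit error in your own code.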

What You'll Need

Python 3.11 or newer;
the `httpx` library;
the `beautifulsoup4` library (for HTML parsing);
Astro proxy credentials.

Install the dependencies:

```bash
pip install httpx beautifulsoup4
```

Project Structure

```
instagram-parser/
├── config.py
├── parser.py
└── requirements.txt
```

Contents of `requirements.txt`:

```
httpx>=0.27
beautifulsoup4>=4.12
```

Step 1. Configure the Proxy

Create a `config.py` file:

```python
PROXY_HOST = "YOUR_PROXY_HOST"
PROXY_PORT = "YOUR_PROXY_PORT"
PROXY_LOGIN = "YOUR_LOGIN"
PROXY_PASSWORD = "YOUR_PASSWORD"
```

In production projects, store credentials in environment variables or a secrets manager, not in the repository code.
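One possible way to do this is to read the settings from environment variables and fail early if any are missing. This is a minimal sketch: the function name `load_proxy_settings` and the `ASTRO_PROXY_*` variable names are assumptions made for this example, not names defined by Astro.

```python
import os
from collections.abc import Mapping


def load_proxy_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Read proxy settings from environment variables; fail early if any are missing."""
    # Hypothetical variable names, chosen for this example only.
    names = {
        "host": "ASTRO_PROXY_HOST",
        "port": "ASTRO_PROXY_PORT",
        "login": "ASTRO_PROXY_LOGIN",
        "password": "ASTRO_PROXY_PASSWORD",
    }
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {key: env[var] for key, var in names.items()}
```

With this in place, `config.py` could assign `PROXY_HOST`, `PROXY_PORT`, and the rest from the returned dictionary instead of hard-coding them.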

Step 2. Connect the Astro Proxy to Python

```python
from urllib.parse import quote

import httpx

from config import PROXY_HOST, PROXY_PORT, PROXY_LOGIN, PROXY_PASSWORD

# safe="" makes quote() escape "/" as well; by default "/" is left unescaped.
proxy = (
    f"http://{quote(PROXY_LOGIN, safe='')}:{quote(PROXY_PASSWORD, safe='')}"
    f"@{PROXY_HOST}:{PROXY_PORT}"
)
```

Note the `quote()` call: if the login or password contains special characters (`@`, `:`, `/`), the proxy URL will be malformed without escaping.
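One detail worth checking: by default `quote()` treats `/` as a safe character and leaves it unescaped, so escaping all three characters requires `safe=""`. The sketch below shows the difference; the password is a made-up value, and `build_proxy_url` and `proxy.example` are hypothetical names used only for illustration.

```python
from urllib.parse import quote

password = "p@ss:w/rd"  # made-up value for illustration

print(quote(password))           # p%40ss%3Aw/rd  (slash left as-is)
print(quote(password, safe=""))  # p%40ss%3Aw%2Frd


def build_proxy_url(host: str, port: str, login: str, password: str) -> str:
    """Assemble an HTTP proxy URL with fully escaped credentials."""
    return (
        f"http://{quote(login, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}"
    )


print(build_proxy_url("proxy.example", "8080", "user", password))
# http://user:p%40ss%3Aw%2Frd@proxy.example:8080
```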

Create the client using a context manager – this guarantees connections are properly closed:

```python
with httpx.Client(
    proxy=proxy,
    timeout=30,
    follow_redirects=True,
) as client:
    ...
```

Now all HTTP requests will go through the Astro proxy.

Step 3. Build the List of Profiles

```python
profiles = [
    "natgeo",
    "nasa",
    "nike",
    "github",
    "instagram",
]
```

Step 4. Request a Profile Page and Validate the Response

```python
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/137.0.0.0 Safari/537.36"
    ),
    "Accept-Language": "en-US,en;q=0.9",
}


def fetch_profile(client: httpx.Client, username: str) -> str:
    url = f"https://www.instagram.com/{username}/"
    response = client.get(url, headers=HEADERS)
    response.raise_for_status()

    # Instagram may respond with status 200 but redirect to the login page.
    # Such a response contains no profile data, so we flag it explicitly.
    if "/accounts/login" in str(response.url):
        raise RuntimeError("Instagram redirected to the login page")

    return response.text
```

Checking the final URL is a critical step. Since the client follows redirects, the login page will come back with a 200 status code, and without this check the parser would treat it as a successful result.
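A slightly stricter variant of this check looks only at the path of the final URL, so a query string that happens to contain `/accounts/login` does not trigger a false positive. This is a sketch under the assumption that the login page lives under the `/accounts/login` path, as in the check above; `is_login_redirect` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def is_login_redirect(final_url: str) -> bool:
    """Return True if the final URL after redirects points at the login page."""
    path = urlsplit(final_url).path
    return path.startswith("/accounts/login")


print(is_login_redirect("https://www.instagram.com/accounts/login/?next=%2Fnasa%2F"))  # True
print(is_login_redirect("https://www.instagram.com/nasa/"))                            # False
```

In `fetch_profile`, it would be called as `is_login_redirect(str(response.url))`.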

Step 5. Extract Data from the HTML

The meta tags are the most reliable source of data on a public profile page – Instagram places a brief summary of the profile there:

```python
from bs4 import BeautifulSoup


def extract_summary(html: str) -> str | None:
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find("meta", property="og:description")
    # .get() returns None if the tag exists but has no content attribute.
    return tag.get("content") if tag else None
```

`og:description` typically includes follower, following, and post counts – enough for monitoring the dynamics of public profiles.
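To track these numbers over time, the summary string has to be turned into integers. The sketch below assumes an English-language summary of the form `"<N> Followers, <N> Following, <N> Posts - ..."` with optional `K`/`M`/`B` suffixes; that wording is an assumption, the sample string is invented, and the pattern should be checked against the responses you actually receive. `parse_counts` is a hypothetical helper name.

```python
import re

_MULTIPLIERS = {"": 1, "K": 1_000, "M": 1_000_000, "B": 1_000_000_000}
_COUNT_RE = re.compile(
    r"(\d[\d.,]*)([KMB]?)\s+(Followers|Following|Posts)", re.IGNORECASE
)


def parse_counts(summary: str) -> dict[str, int]:
    """Extract follower, following, and post counts from a summary string."""
    counts: dict[str, int] = {}
    for number, suffix, label in _COUNT_RE.findall(summary):
        # Drop thousands separators, then apply the K/M/B multiplier.
        value = float(number.replace(",", "")) * _MULTIPLIERS[suffix.upper()]
        counts[label.lower()] = int(round(value))
    return counts


# Invented sample text, not real profile data.
sample = "1.2M Followers, 350 Following, 4,821 Posts - See Instagram photos and videos"
print(parse_counts(sample))
# {'followers': 1200000, 'following': 350, 'posts': 4821}
```

Abbreviated values such as `1.2M` are rounded by the service, so the parsed numbers are approximate for large accounts.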

Step 6. Run the Parser

```python
import time


def main() -> None:
    with httpx.Client(
        proxy=proxy,
        timeout=30,
        follow_redirects=True,
    ) as client:
        for username in profiles:
            try:
                html = fetch_profile(client, username)
                summary = extract_summary(html)
                print(f"{username}: {summary or 'no meta data found'}")
            except httpx.HTTPStatusError as e:
                status = e.response.status_code
                if status == 429:
                    print(f"{username}: rate limit exceeded, pausing for 60 sec")
                    time.sleep(60)
                else:
                    print(f"{username}: HTTP {status}")
            except Exception as e:
                print(f"{username}: {e}")

            # A pause between requests reduces the risk of being blocked.
            time.sleep(5)


if __name__ == "__main__":
    main()
```

How the Instagram Parser Works Through the Proxy

Python → httpx Client → Astro Proxy → www.instagram.com

All requests pass through the proxy, while the application code barely changes. Requests are made strictly by domain name, so the proxy's restriction against direct IP access has no effect on how the parser works. The same architecture works for both small projects and large-scale web scraping systems – as load grows, it's enough to parallelize the processing of the profile list and expand the proxy pool.
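Parallelizing the profile list could look roughly like the sketch below, which uses a thread pool from the standard library. `fetch_many` and `fetch_one` are hypothetical names: `fetch_one` stands for any callable that takes a username and returns HTML, for example a thin wrapper around `fetch_profile`. The stub here replaces the network call so the example runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable


def fetch_many(
    usernames: list[str],
    fetch_one: Callable[[str], str],
    max_workers: int = 8,
) -> tuple[dict[str, str], dict[str, Exception]]:
    """Fetch profiles concurrently; return (results, errors) keyed by username."""
    results: dict[str, str] = {}
    errors: dict[str, Exception] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch_one, name): name for name in usernames}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:  # one failed profile must not stop the batch
                errors[name] = exc
    return results, errors


def fake_fetch(username: str) -> str:
    """Stand-in for a real fetch function, used only in this example."""
    if username == "missing":
        raise RuntimeError("not found")
    return f"<html>{username}</html>"


results, errors = fetch_many(["nasa", "nike", "missing"], fake_fetch)
print(sorted(results))  # ['nasa', 'nike']
print(sorted(errors))   # ['missing']
```

Keep `max_workers` well below the per-port connection limit mentioned earlier, and keep pauses between requests inside `fetch_one`, since running in parallel does not remove the need for pacing.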

Where This Approach Can Be Used

This example can be adapted for a variety of tasks:

monitoring public Instagram profiles and follower dynamics;
competitor analysis;
open data collection;
marketing research;
building analytics services;
web scraping automation.

Conclusion

Using Astro proxies lets you quickly plug a networking layer into an existing Python parser without changing the application's business logic. The parser in this case study correctly handles login-page redirects, respects pauses between requests, reacts to rate limits from the service, and works exclusively through domain names – fully in line with Astro's proxy connection model.

As the project grows and data volume increases, it's enough to scale the parser itself while keeping the same proxy connection scheme. This approach works well both for small internal tools and for large-scale projects focused on automated collection and analysis of publicly available information.

If you're just getting started with web scraping or looking to scale an existing scraper, the Astro team is here to help. Contact our support team for expert guidance on choosing the right proxy type, optimizing your configuration, testing the service, and getting your project up and running as quickly as possible.


David Melikian / Network engineer and technical writer

A network engineer and technical writer focused on proxy infrastructure, web scraping, and traffic management
