1. Technology selection: why Python rather than Java?

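Conclusion: "Use Python for the research phase; if QPS goes through the roof after launch, consider a rewrite in Java."
2. Architecture at a glance (readable in 3 minutes)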

3. Pre-development setup (done in 5 minutes)
Environment
Python 3.11 + VSCode + a virtual environment
Install all dependencies in one go
bash
python -m venv venv
source venv/bin/activate
pip install playwright pandas tqdm loguru fake-useragent aiofiles
playwright install chromium  # downloads the Chromium browser automatically
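After the install step, it can be useful to confirm that every package is importable from inside the virtual environment. The snippet below is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical, and the list uses import names (the `fake-useragent` distribution is imported as `fake_useragent`).

```python
import importlib.util

# Import names for the packages installed above. Note the underscore:
# the fake-useragent distribution is imported as fake_useragent.
REQUIRED = ["playwright", "pandas", "tqdm", "loguru", "fake_useragent", "aiofiles"]


def missing_modules(names):
    """Return the names that cannot be located on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("missing:", missing if missing else "none")
```

An empty result only shows that the packages are installed; it does not check that `playwright install chromium` has downloaded the browser.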
Target fields & CSS selectors

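4. MVP: up and running in 120 lines of code
A single-file script that scrapes up to 10 ASINs concurrently with asyncio, automatically retries on HTTP 429, and writes the results directly to amazon.csv.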
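The two mechanisms named above, a concurrency cap and retry on 429, can be sketched in isolation without a browser. This is an illustrative sketch rather than the article's script: `TooManyRequests`, `with_retry`, `scrape_all`, and the `fetch` callable are hypothetical names, and the exponential backoff with jitter is an assumed retry policy.

```python
import asyncio
import random

CONCURRENCY = 10  # max ASINs in flight at once
RETRY = 3         # attempts per ASIN


class TooManyRequests(Exception):
    """Hypothetical marker raised by fetch() when the server answers 429."""


async def with_retry(fetch, asin, retries=RETRY, base_delay=1.0):
    """Await fetch(asin); on TooManyRequests, back off and try again."""
    for attempt in range(1, retries + 1):
        try:
            return await fetch(asin)
        except TooManyRequests:
            if attempt == retries:
                raise  # retry budget exhausted
            # Exponential backoff plus random jitter (assumed policy).
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay)
            await asyncio.sleep(delay)


async def scrape_all(fetch, asins, concurrency=CONCURRENCY, base_delay=1.0):
    """Run fetch over all ASINs with at most `concurrency` tasks active."""
    sem = asyncio.Semaphore(concurrency)

    async def worker(asin):
        async with sem:  # blocks while `concurrency` workers are running
            return await with_retry(fetch, asin, base_delay=base_delay)

    # return_exceptions=True keeps one failed ASIN from cancelling the rest;
    # results come back in the same order as `asins`.
    return await asyncio.gather(*(worker(a) for a in asins), return_exceptions=True)
```

In a real run, `fetch` would wrap the Playwright page logic and raise `TooManyRequests` when the response status is 429. Failed ASINs appear in the result list as exception objects, so the caller can log them or queue them for a later pass.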
Python
import asyncio
import csv
import re
import random
from pathlib import Path

from playwright.async_api import async_playwright
from loguru import logger
from fake_useragent import UserAgent
import pandas as pd

CONCURRENCY = 10   # number of ASINs scraped concurrently
RETRY = 3          # retry count for failed requests such as HTTP 429
TIMEOUT = 35_000   # Playwright timeouts are given in milliseconds
RESULT = "amazon.csv"
HEADERS = ["asin", "title", "price", "rating", "review_count", "availability", "img_url", "scrape_time"]


async def scrape_one(page, asin: str) -> dict:
    url = f"https://www.amazon.com/dp/{asin}"
    logger.info("
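The HEADERS list fixes the column order of amazon.csv. As a minimal sketch, assuming each scraped product arrives as a dict keyed by those column names, rows could be appended with the standard `csv` module. The helpers `append_rows` and `stamp` are hypothetical names and do not come from the original script.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

HEADERS = ["asin", "title", "price", "rating", "review_count",
           "availability", "img_url", "scrape_time"]


def stamp(row):
    """Return a copy of row with scrape_time set to the current UTC time."""
    now = datetime.now(timezone.utc).isoformat(timespec="seconds")
    return {**row, "scrape_time": now}


def append_rows(path, rows, headers=HEADERS):
    """Append dict rows to a CSV file; write the header only if the file is new or empty."""
    path = Path(path)
    write_header = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as f:
        # Missing fields are written as empty strings; unknown keys are dropped.
        writer = csv.DictWriter(f, fieldnames=headers,
                                restval="", extrasaction="ignore")
        if write_header:
            writer.writeheader()
        for row in rows:
            writer.writerow(row)
```

Appending, rather than rewriting the file on every batch, keeps earlier results if a long run is interrupted.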
Reviewing editor: Huang Yu