Asynchronous Web Crawling


from bs4 import BeautifulSoup as bs
import aiohttp
import asyncio


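async def fetch(session, url, i):
    # Download one list page and print the title of its first post.
    async with session.get(url) as res:
        html = await res.text()
        soup = bs(html, "html.parser")
        tag = soup.find("span", class_="title")  # first post title only
        # find() returns None when nothing matches, so guard before reading .text
        title = tag.text if tag is not None else "(no title found)"
        print(f"{i+1} : {title}")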


async def main():
    BASE_URL = "https://myinbox.tistory.com/"
    # List pages 1 through 10 of the blog
    urls = [f"{BASE_URL}?page={i}" for i in range(1, 11)]
    # One shared session; gather() runs all ten requests concurrently
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(*[fetch(session, url, i) for i, url in enumerate(urls)])


if __name__ == "__main__":
    asyncio.run(main())
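
Because fetch() prints as soon as each response arrives, the lines above can come out in any page order. One way to get ordered output is to return the value instead of printing it, since asyncio.gather() returns its results in the order the awaitables were passed in, not the order they finished. The sketch below shows only that behavior: fake_fetch and crawl are hypothetical names, and asyncio.sleep stands in for the real HTTP request so the example runs without a network connection.

```python
import asyncio


async def fake_fetch(i: int, delay: float) -> str:
    # Hypothetical stand-in for fetch(): sleeps instead of making an HTTP request.
    await asyncio.sleep(delay)
    # Return the formatted line rather than printing it here.
    return f"{i + 1} : title-{i + 1}"


async def crawl(delays: list) -> list:
    # gather() collects results in the order the coroutines were passed in,
    # regardless of which one completes first.
    return await asyncio.gather(*(fake_fetch(i, d) for i, d in enumerate(delays)))


if __name__ == "__main__":
    # The first task is the slowest, yet its line is still printed first.
    for line in asyncio.run(crawl([0.03, 0.01, 0.02])):
        print(line)
```

Applying the same idea to the real script would mean returning the title from fetch() and printing the list that gather() hands back in main().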

Reference: https://github.com/amamov/teaching-async-python/blob/main/3-%EB%8F%99%EC%8B%9C%EC%84%B1-%ED%94%84%EB%A1%9C%EA%B7%B8%EB%9E%98%EB%B0%8D%EC%9C%BC%EB%A1%9C-%EB%8D%B0%EC%9D%B4%ED%84%B0-%EC%88%98%EC%A7%91/03-scraping.py
