I am working on a global media monitoring project that tracks news about a specific country (Turkey) across 500+ international sources, with a focus on its activity in several other countries (Egypt, Libya, Somalia, Ethiopia).
My current approach uses a Boolean search query like this:
(Turkey OR Erdogan OR KAAN OR Fidan OR Ankara OR Istanbul)
However, I am facing two main challenges:
Noise/false positives: how can I refine this logic to avoid irrelevant results while ensuring I don't miss niche geopolitical updates?
Automation: are there specific Python libraries or API integration patterns (like using NewsAPI or Inoreader's API) that are better suited for handling such a high volume of sources and country-specific tags?
I need a way to categorize these results by country automatically. What would be the most efficient technical workflow to manage this scale of data?
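To make the problem concrete, here is a minimal Python sketch of what I mean (an illustration only, not my production code). It assumes the Boolean query is applied as a case-insensitive whole-word match. The names QUERY_TERMS, COUNTRY_TERMS, matches_query and tag_countries are hypothetical, the capital and city keywords are my own guesses at useful tags, and the headlines are made up for the example.

```python
import re

# The Boolean query above, expressed as one case-insensitive whole-word regex.
QUERY_TERMS = ["Turkey", "Erdogan", "KAAN", "Fidan", "Ankara", "Istanbul"]
QUERY_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(t) for t in QUERY_TERMS) + r")\b",
    re.IGNORECASE,
)

# Hypothetical keyword lists for the countries I want to tag (assumed, not exhaustive).
COUNTRY_TERMS = {
    "Egypt": ["Egypt", "Cairo"],
    "Libya": ["Libya", "Tripoli", "Benghazi"],
    "Somalia": ["Somalia", "Mogadishu"],
    "Ethiopia": ["Ethiopia", "Addis Ababa"],
}
COUNTRY_RES = {
    country: re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    for country, terms in COUNTRY_TERMS.items()
}


def matches_query(text: str) -> bool:
    """Return True if the text contains any of the query terms."""
    return QUERY_RE.search(text) is not None


def tag_countries(text: str) -> list[str]:
    """Return the sorted list of countries whose keywords appear in the text."""
    return sorted(c for c, rx in COUNTRY_RES.items() if rx.search(text))


# Made-up headlines, for illustration only.
headlines = [
    "Ankara and Mogadishu sign maritime defence pact",
    "How to roast a turkey for Thanksgiving",
    "Fidan meets Libyan counterpart in Tripoli",
]

for h in headlines:
    if matches_query(h):
        print(h, "->", tag_countries(h))
```

The second headline matches the query even though it is about the bird, which is the kind of false positive I am trying to eliminate. The third is only tagged as Libya because of "Tripoli" ("Libyan" does not match the whole word "Libya"), which shows how easily plain keyword lists miss relevant items.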