
Content Curation

Automatically collect and organize content with n8n: RSS feeds, news aggregation, and knowledge management.

0

🎯 Lesson Objectives

After this lesson, you will be able to:

✅ Build an RSS feed aggregator with keyword filtering

✅ Implement AI-powered article summarization

✅ Monitor Hacker News and Reddit for relevant content

✅ Create an automated weekly content digest email

Stay informed without drowning in information. This lesson shows how to automate content curation so that relevant content is collected, filtered, and delivered to you.

1

🔍 Content Curation Overview

Content Sources to Automate:

📚 Content Curation Sources

  • 📰 News & Blogs: RSS feeds, tech blogs, industry news, company blogs
  • 🐦 Social Media: Twitter/X lists, LinkedIn posts, Reddit threads, Hacker News
  • 📊 Data & Research: research papers, market reports, industry stats, competitor news
  • 📧 Newsletters: email newsletters, Substack, industry digests, analyst reports

Checkpoint

How many main types of content sources should you monitor?

2

⚡ Workflow 1: RSS Feed Aggregator

Multi-Feed Reader:

RSS Aggregator

  1. Schedule: Every 6 Hours
  2. 📡 Fetch Multiple RSS Feeds (Tech News, Industry Blogs, Competitor Updates, Product Hunt)
  3. 🔄 Deduplicate by URL/Title
  4. 🔍 Filter by Keywords
  5. Score & Rank
  6. 💾 Save to Database/Notion
  7. 📧 Daily Digest Email
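
The "Deduplicate by URL/Title" step can be sketched roughly as follows. This is a minimal illustration rather than the lesson's own implementation: the helper names (`normalizeUrl`, `normalizeTitle`, `dedupeArticles`) are hypothetical, and it assumes each article carries `link` and `title` fields, as the later code in this lesson does.

```javascript
// Hypothetical helper: strip fragments, utm_* tracking parameters, and a trailing slash
// so that trivially different links compare equal.
function normalizeUrl(rawUrl) {
  try {
    const url = new URL(rawUrl);
    url.hash = '';
    for (const key of [...url.searchParams.keys()]) {
      if (key.startsWith('utm_')) url.searchParams.delete(key);
    }
    return url.toString().replace(/\/$/, '');
  } catch {
    // Not a parseable URL: fall back to the trimmed raw string
    return String(rawUrl || '').trim();
  }
}

// Hypothetical helper: lowercase and collapse punctuation.
// Titles without Latin letters or digits normalize to '' and are not compared by title.
function normalizeTitle(title) {
  return String(title || '').toLowerCase().replace(/[^a-z0-9]+/g, ' ').trim();
}

// Keep the first occurrence of each article, matching on URL or on title.
function dedupeArticles(articles) {
  const seenUrls = new Set();
  const seenTitles = new Set();
  const unique = [];
  for (const article of articles) {
    const urlKey = normalizeUrl(article.link);
    const titleKey = normalizeTitle(article.title);
    const isDuplicate =
      (urlKey && seenUrls.has(urlKey)) || (titleKey && seenTitles.has(titleKey));
    if (isDuplicate) continue;
    if (urlKey) seenUrls.add(urlKey);
    if (titleKey) seenTitles.add(titleKey);
    unique.push(article);
  }
  return unique;
}
```

In an n8n Code node, something like `dedupeArticles($input.all().map(item => item.json))` could feed the result into the filtering step. Matching on title as well as URL catches the same story syndicated under different links, at the cost of occasionally merging unrelated articles that share a headline.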

RSS Feed Configuration:

JavaScript
// Define the feeds to monitor
const feeds = [
  {
    name: 'TechCrunch',
    url: 'https://techcrunch.com/feed/',
    category: 'Tech News',
    priority: 'high'
  },
  {
    name: 'Hacker News',
    url: 'https://news.ycombinator.com/rss',
    category: 'Tech',
    priority: 'high'
  },
  {
    name: 'Product Hunt',
    url: 'https://www.producthunt.com/feed',
    category: 'Products',
    priority: 'medium'
  },
  {
    name: 'The Verge',
    url: 'https://www.theverge.com/rss/index.xml',
    category: 'Tech News',
    priority: 'medium'
  },
  {
    name: 'Smashing Magazine',
    url: 'https://www.smashingmagazine.com/feed/',
    category: 'Design/Dev',
    priority: 'medium'
  }
];

// Emit one n8n item per feed so that a downstream node can read each feed URL in turn
return feeds.map(feed => ({ json: feed }));
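
The scoring step later in this lesson reads `article.sourcePriority`, while the feed list stores that value as `priority`, so the feed metadata has to be copied onto each article somewhere in between. A minimal sketch of that step, assuming the parsed feed items arrive as plain objects; `tagWithSource` is a hypothetical helper name, and the `category` field on the article is likewise an assumption:

```javascript
// Hypothetical helper: copy a feed's metadata onto every article read from it.
// `source` and `sourcePriority` match the field names used elsewhere in this lesson.
function tagWithSource(feed, feedItems) {
  return feedItems.map(item => ({
    ...item,
    source: feed.name,
    category: feed.category,
    sourcePriority: feed.priority
  }));
}

// Example usage with one feed and two parsed items
const exampleFeed = { name: 'TechCrunch', category: 'Tech News', priority: 'high' };
const exampleItems = [
  { title: 'First post', link: 'https://example.com/1' },
  { title: 'Second post', link: 'https://example.com/2' }
];
const taggedItems = tagWithSource(exampleFeed, exampleItems);
console.log(taggedItems[0].sourcePriority); // high
```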

Content Filtering & Scoring:

JavaScript
const items = $input.all();

// Keywords that raise an article's score
const boostKeywords = ['ai', 'automation', 'n8n', 'workflow', 'productivity', 'no-code'];
const mustInclude = []; // If non-empty, an article must contain at least one of these
const exclude = ['sponsored', 'advertisement', 'promotion'];

// Match whole words only, so that 'ai' does not match 'email' or 'said'.
// Keywords are assumed to contain no regex special characters.
const hasWord = (text, word) => new RegExp(`\\b${word}\\b`).test(text);

const processedItems = [];

for (const item of items) {
  const article = item.json;
  const title = (article.title || '').toLowerCase();
  const description = (article.description || '').toLowerCase();
  const content = title + ' ' + description;

  // Skip excluded content
  if (exclude.some(word => content.includes(word))) {
    continue;
  }

  // Check must-include keywords
  if (mustInclude.length > 0) {
    const hasRequired = mustInclude.some(word => hasWord(content, word));
    if (!hasRequired) continue;
  }

  // Calculate relevance score
  let score = 0;

  // Boost for keywords (a title match counts double)
  for (const keyword of boostKeywords) {
    if (hasWord(title, keyword)) score += 10;
    if (hasWord(description, keyword)) score += 5;
  }

  // Boost for recency (an unparseable date yields NaN and gets no boost)
  const pubDate = new Date(article.pubDate);
  const hoursOld = (Date.now() - pubDate.getTime()) / (1000 * 60 * 60);
  if (hoursOld < 6) score += 20;
  else if (hoursOld < 24) score += 10;
  else if (hoursOld < 48) score += 5;

  // Boost based on source priority
  const sourcePriority = article.sourcePriority || 'medium';
  if (sourcePriority === 'high') score += 15;
  else if (sourcePriority === 'medium') score += 5;

  processedItems.push({
    ...article,
    score,
    processedAt: new Date().toISOString()
  });
}

// Sort by score and keep the top 30 as n8n items
processedItems.sort((a, b) => b.score - a.score);

return processedItems.slice(0, 30).map(json => ({ json }));

Checkpoint

Which criteria does the RSS aggregator's scoring system use?

3

🔧 Workflow 2: AI-Powered Summarization

Summarize Articles:

Article Summarizer

  1. 🔔 Trigger: New High-Score Article
  2. 📄 Fetch Full Article Content
  3. 🤖 Send to OpenAI for Summary
  4. 📌 Extract Key Points
  5. 🏷️ Tag & Categorize
  6. 💾 Save to Knowledge Base

AI Summary Prompt:

JavaScript
const article = $input.item.json;

// Truncate the body to keep the prompt short; tolerate a missing content field
const content = (article.content || '').substring(0, 3000);

const prompt = `
Analyze this article and provide:
1. A 2-3 sentence summary
2. 3-5 key takeaways as bullet points
3. Relevance score (1-10) for a tech professional
4. 2-3 relevant tags

Article Title: ${article.title}
Article Content: ${content}

Respond in JSON format:
{
  "summary": "...",
  "keyTakeaways": ["...", "..."],
  "relevanceScore": 8,
  "tags": ["AI", "Automation"]
}
`;

return { json: { prompt } };

Process AI Response:

JavaScript
const { article, aiResponse: rawResponse } = $input.item.json;

// The model may return malformed JSON, so fail with a clear message
// instead of a bare SyntaxError
let aiResponse;
try {
  aiResponse = JSON.parse(rawResponse);
} catch (error) {
  throw new Error(`AI response for "${article.title}" is not valid JSON: ${error.message}`);
}

return {
  json: {
    title: article.title,
    url: article.link,
    source: article.source,
    publishedAt: article.pubDate,

    // AI-generated
    summary: aiResponse.summary,
    keyTakeaways: aiResponse.keyTakeaways,
    relevanceScore: aiResponse.relevanceScore,
    tags: aiResponse.tags,

    // Metadata
    savedAt: new Date().toISOString(),
    status: 'unread'
  }
};
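
Models do not always return bare JSON: the object is sometimes wrapped in a Markdown code fence, or a field comes back missing or with the wrong type. A more defensive parser might look like the sketch below. `parseAiJson` is a hypothetical helper, and clamping the score to the 1-10 range requested by the prompt is an assumption about how to treat out-of-range values.

```javascript
// Hypothetical helper: parse the model's reply defensively.
// Returns null when the reply cannot be parsed into an object.
function parseAiJson(rawText) {
  const fence = '`'.repeat(3); // the Markdown code-fence marker
  const text = String(rawText || '').trim();

  // Strip a surrounding code fence (optionally labeled "json") if present
  const unfenced = text
    .replace(new RegExp('^' + fence + '(?:json)?\\s*', 'i'), '')
    .replace(new RegExp('\\s*' + fence + '$'), '');

  let parsed;
  try {
    parsed = JSON.parse(unfenced);
  } catch {
    return null;
  }
  if (parsed === null || typeof parsed !== 'object') return null;

  // Fill in safe defaults and clamp the score to the range the prompt asks for
  const score = Number(parsed.relevanceScore);
  return {
    summary: typeof parsed.summary === 'string' ? parsed.summary : '',
    keyTakeaways: Array.isArray(parsed.keyTakeaways) ? parsed.keyTakeaways : [],
    relevanceScore: Number.isFinite(score) ? Math.min(10, Math.max(1, score)) : null,
    tags: Array.isArray(parsed.tags) ? parsed.tags : []
  };
}
```

Returning `null` rather than throwing lets the workflow route unparseable replies to a retry or manual-review branch instead of stopping the whole run.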

Checkpoint

What information does the AI summarization prompt ask the model to return?

4

🏗️ Workflow 3: Hacker News Monitor

Track HN Discussions:

Hacker News Monitor

  1. Schedule: Every Hour
  2. 📡 Fetch HN Top Stories API
  3. 🔍 Filter by Score & Keywords
  4. Check if Already Saved
     • 🆕 New → Fetch Comments, Save to DB, Notify
     • 🔄 Exists → Update Score if Changed

HN API Integration:

JavaScript
// Note: this assumes a global fetch() is available in the Code node's runtime.
// If it is not, make these requests with HTTP Request nodes instead.
const baseUrl = 'https://hacker-news.firebaseio.com/v0';

// Fetch the IDs of the current top stories
const topStories = await fetch(`${baseUrl}/topstories.json`).then(r => r.json());

// Get details for the top 30, requesting them in parallel
const details = await Promise.all(
  topStories.slice(0, 30).map(id =>
    fetch(`${baseUrl}/item/${id}.json`).then(r => r.json())
  )
);

const stories = details
  .filter(story => story && story.type === 'story')
  .map(story => {
    const hnUrl = `https://news.ycombinator.com/item?id=${story.id}`;
    return {
      id: story.id,
      title: story.title,
      url: story.url || hnUrl, // Ask HN and similar posts have no external URL
      score: story.score,
      comments: story.descendants,
      by: story.by,
      time: story.time, // Unix timestamp in seconds
      hnUrl
    };
  });

return [{ json: { stories } }];

Filter High-Quality Posts:

JavaScript
const stories = $input.item.json.stories;

// Filter criteria
const minScore = 100;
const minComments = 20;
const interestingTopics = ['ai', 'startup', 'automation', 'programming', 'show hn'];

// Match whole words only, so that 'ai' does not match 'email' or 'said'
const mentions = (text, topic) => new RegExp(`\\b${topic}\\b`).test(text);

const filtered = stories.filter(story => {
  const title = (story.title || '').toLowerCase();
  const comments = story.comments || 0;

  // Must meet minimum engagement on both points and comments
  if (story.score < minScore || comments < minComments) return false;

  const isInteresting = interestingTopics.some(topic => mentions(title, topic));

  // Accept if very popular, heavily discussed, or on an interesting topic
  return story.score >= 200 || comments >= 50 || isInteresting;
});

return { json: { stories: filtered } };
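
The "Check if Already Saved" branch of the workflow is not shown in code. One way to split the filtered stories into new and changed ones is sketched below, assuming the previously saved scores have already been loaded into a `Map` keyed by story ID. `classifyStories` and that `Map` are illustrative names, not part of the lesson's workflow.

```javascript
// Hypothetical helper: split stories into ones never seen before and
// ones whose score has changed since they were saved.
function classifyStories(stories, savedScoresById) {
  const fresh = [];
  const updated = [];
  for (const story of stories) {
    if (!savedScoresById.has(story.id)) {
      fresh.push(story); // New: fetch comments, save, notify
    } else if (savedScoresById.get(story.id) !== story.score) {
      updated.push(story); // Exists: update the stored score
    }
    // Otherwise the story is saved and unchanged, so nothing needs to happen
  }
  return { fresh, updated };
}

// Example usage
const savedScores = new Map([[101, 150], [102, 300]]);
const classified = classifyStories(
  [
    { id: 101, score: 180 },
    { id: 102, score: 300 },
    { id: 103, score: 220 }
  ],
  savedScores
);
console.log(classified.fresh.map(s => s.id));   // [ 103 ]
console.log(classified.updated.map(s => s.id)); // [ 101 ]
```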

Checkpoint

Which conditions does the Hacker News monitor use to filter posts?

5

📊 Workflow 4: Reddit Monitoring

Track Subreddit Posts:

Reddit Monitor

  1. Schedule: Every 2 Hours
  2. 📡 Fetch Subreddit Posts (r/automation, r/nocode, r/selfhosted, r/sideproject)
  3. 🔍 Filter by Upvotes & Keywords
  4. 💾 Save New Posts
  5. 📧 Weekly Digest

Reddit API Setup:

JavaScript
// Note: this assumes a global fetch() is available in the Code node's runtime.
const subreddits = ['automation', 'nocode', 'selfhosted', 'sideproject'];
const posts = [];

for (const subreddit of subreddits) {
  const url = `https://www.reddit.com/r/${subreddit}/hot.json?limit=25`;

  // Reddit expects a descriptive User-Agent on API requests
  const response = await fetch(url, {
    headers: { 'User-Agent': 'n8n-content-bot/1.0' }
  });

  // Skip this subreddit if the request was rejected or rate-limited
  if (!response.ok) continue;
  const data = await response.json();

  for (const item of data.data.children) {
    const post = item.data;

    posts.push({
      id: post.id,
      subreddit: post.subreddit,
      title: post.title,
      url: post.url,
      redditUrl: `https://reddit.com${post.permalink}`,
      score: post.score,
      comments: post.num_comments,
      author: post.author,
      created: new Date(post.created_utc * 1000).toISOString(), // seconds to milliseconds
      flair: post.link_flair_text
    });
  }
}

// Sort by score and keep the top 50
posts.sort((a, b) => b.score - a.score);

return [{ json: { posts: posts.slice(0, 50) } }];
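
The workflow lists a "Filter by Upvotes & Keywords" step that the fetch code above does not perform. A minimal sketch of such a filter follows; `filterPosts`, `escapeRegExp`, and the default threshold of 50 upvotes are illustrative assumptions, not values taken from the lesson.

```javascript
// Escape regex metacharacters so that keywords are matched literally
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: keep posts above a score threshold whose title
// mentions at least one keyword as a whole word (case-insensitive).
// With an empty keyword list, only the score threshold applies.
function filterPosts(posts, { minScore = 50, keywords = [] } = {}) {
  const patterns = keywords.map(
    keyword => new RegExp(`\\b${escapeRegExp(keyword)}\\b`, 'i')
  );
  return posts.filter(post => {
    if (post.score < minScore) return false;
    if (patterns.length === 0) return true;
    return patterns.some(pattern => pattern.test(post.title));
  });
}

// Example usage
const samplePosts = [
  { title: 'New AI workflow tool', score: 120 },
  { title: 'Email tips for founders', score: 300 },
  { title: 'AI side project', score: 10 }
];
const keptPosts = filterPosts(samplePosts, { minScore: 50, keywords: ['ai'] });
console.log(keptPosts.map(post => post.title)); // [ 'New AI workflow tool' ]
```

Whole-word matching matters for short keywords: a plain substring test for 'ai' would also accept the "Email tips" post.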

Checkpoint

How many subreddits does the Reddit monitor track?

6

💡 Workflow 5: Newsletter Curation

Curate Newsletter Content:

Newsletter Curator

  1. 📧 New Email in 'Newsletters' Label
  2. 🔗 Extract Links from Email
  3. 📂 Categorize by Newsletter
  4. 💾 Store Links with Metadata
  5. 📰 Weekly: Generate Curated Digest

Extract Links from Emails:

JavaScript
const email = $input.item.json;
const body = email.html || email.text || '';

// Extract all links. This simple pattern only matches anchors whose
// content is plain text; links wrapping images or nested tags are skipped.
const linkRegex = /<a[^>]+href=["']([^"']+)["'][^>]*>([^<]+)<\/a>/gi;
const matches = [...body.matchAll(linkRegex)];

// Filter out navigation/unsubscribe links
const excludePatterns = [
  'unsubscribe',
  'mailto:',
  'twitter.com',
  'facebook.com',
  'linkedin.com/share',
  'javascript:'
];

const links = matches
  .map(match => ({
    url: match[1],
    text: match[2].trim()
  }))
  .filter(link => {
    const urlLower = link.url.toLowerCase();
    // Drop in-page anchors, but keep real URLs that merely contain a fragment
    if (urlLower.startsWith('#')) return false;
    return !excludePatterns.some(pattern => urlLower.includes(pattern));
  })
  .filter(link => link.text.length > 5); // Skip short link texts

return {
  json: {
    newsletter: email.from,
    subject: email.subject,
    receivedAt: email.date,
    links: links.slice(0, 20) // Limit per email
  }
};
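
For the "Categorize by Newsletter" and "Store Links with Metadata" steps, the extracted links can be flattened into one record per link, each carrying the newsletter it came from. The sketch below assumes `from` is a plain string such as `Name <address>`; some email nodes return a structured object instead. `parseSender`, `toLinkRecords`, and the record field names are hypothetical.

```javascript
// Hypothetical helper: split a "Name <address>" sender string into its parts.
// Falls back to using the whole string for both fields.
function parseSender(from) {
  const raw = String(from || '').trim();
  const match = /^"?([^"<]*?)"?\s*<([^>]+)>$/.exec(raw);
  if (match) {
    const address = match[2].trim();
    return { name: match[1].trim() || address, address };
  }
  return { name: raw, address: raw };
}

// Hypothetical helper: one storable record per link, tagged with its newsletter
function toLinkRecords(extracted) {
  const sender = parseSender(extracted.newsletter);
  return extracted.links.map(link => ({
    url: link.url,
    title: link.text,
    newsletter: sender.name,
    senderAddress: sender.address,
    subject: extracted.subject,
    receivedAt: extracted.receivedAt
  }));
}

// Example usage with the output shape of the extraction step
const linkRecords = toLinkRecords({
  newsletter: '"Example Weekly" <hello@example.com>',
  subject: 'Issue 42',
  receivedAt: '2024-01-07T08:00:00Z',
  links: [{ url: 'https://example.com/post', text: 'An interesting post' }]
});
console.log(linkRecords[0].newsletter); // Example Weekly
```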

Checkpoint

Which regex pattern does the newsletter curator use to extract links?

7

🌟 Workflow 6: Content Digest Generator

Generate Weekly Digest:

Weekly Content Digest

  1. 📅 Schedule: Sunday 8 PM
  2. 📥 Get This Week's Saved Content
  3. 📂 Group by Category/Source
  4. Select Top Items per Category
  5. ✍️ Generate Digest Email
  6. 📧 Send to Subscribers

Digest Email Template:

JavaScript
const content = $input.item.json;
const weekStart = new Date();
weekStart.setDate(weekStart.getDate() - 7);

// Escape feed-supplied text so that titles and summaries cannot break the HTML
const escapeHtml = value => String(value ?? '')
  .replace(/&/g, '&amp;')
  .replace(/</g, '&lt;')
  .replace(/>/g, '&gt;')
  .replace(/"/g, '&quot;');

const digest = `
<!DOCTYPE html>
<html>
<head>
  <style>
    body { font-family: -apple-system, sans-serif; max-width: 600px; margin: 0 auto; padding: 20px; }
    .header { text-align: center; border-bottom: 3px solid #4CAF50; padding-bottom: 20px; }
    .section { margin: 30px 0; }
    .section-title { color: #333; font-size: 18px; border-left: 4px solid #4CAF50; padding-left: 10px; }
    .item { padding: 15px 0; border-bottom: 1px solid #eee; }
    .item-title { font-weight: bold; color: #1a73e8; text-decoration: none; }
    .item-meta { color: #666; font-size: 12px; margin-top: 5px; }
    .item-summary { color: #333; margin-top: 8px; }
    .stats { background: #f5f5f5; padding: 15px; border-radius: 8px; text-align: center; }
    .stat { display: inline-block; margin: 0 20px; }
    .stat-value { font-size: 24px; font-weight: bold; color: #4CAF50; }
  </style>
</head>
<body>
  <div class="header">
    <h1>Weekly Content Digest</h1>
    <p>Week of ${weekStart.toLocaleDateString()}</p>
  </div>

  <div class="stats">
    <div class="stat">
      <div class="stat-value">${content.totalItems}</div>
      <div>Articles Curated</div>
    </div>
    <div class="stat">
      <div class="stat-value">${content.categories.length}</div>
      <div>Categories</div>
    </div>
    <div class="stat">
      <div class="stat-value">${content.topScore}</div>
      <div>Top Score</div>
    </div>
  </div>

  ${content.categories.map(cat => `
    <div class="section">
      <h2 class="section-title">${cat.emoji} ${escapeHtml(cat.name)}</h2>
      ${cat.items.slice(0, 5).map(item => `
        <div class="item">
          <a href="${escapeHtml(item.url)}" class="item-title">${escapeHtml(item.title)}</a>
          <div class="item-meta">${escapeHtml(item.source)} • ${item.score} points</div>
          ${item.summary ? `<div class="item-summary">${escapeHtml(item.summary)}</div>` : ''}
        </div>
      `).join('')}
    </div>
  `).join('')}

  <div style="text-align: center; margin-top: 40px; color: #666; font-size: 12px;">
    <p>Curated automatically by n8n</p>
    <p><a href="#">Manage preferences</a> | <a href="#">Unsubscribe</a></p>
  </div>
</body>
</html>
`;

return { json: { digest } };
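
The template expects its input to already contain `totalItems`, `topScore`, and a `categories` array, which corresponds to the "Group by Category/Source" and "Select Top Items per Category" steps. A minimal sketch of building that object from a flat list of saved items follows; `buildDigestContent`, the `category` field on each item, the `'Uncategorized'` fallback, and the placeholder emoji are all assumptions.

```javascript
// Hypothetical helper: group saved items by category and keep the
// highest-scoring ones in each group, in the shape the template reads.
function buildDigestContent(items, perCategory = 5) {
  const byCategory = new Map();
  for (const item of items) {
    const name = item.category || 'Uncategorized';
    if (!byCategory.has(name)) byCategory.set(name, []);
    byCategory.get(name).push(item);
  }

  const categories = [...byCategory.entries()].map(([name, group]) => ({
    name,
    emoji: '📌', // Placeholder; map category names to emoji as needed
    items: [...group].sort((a, b) => b.score - a.score).slice(0, perCategory)
  }));

  return {
    totalItems: items.length,
    topScore: items.reduce((max, item) => Math.max(max, item.score), 0),
    categories
  };
}

// Example usage
const digestContent = buildDigestContent(
  [
    { title: 'A', category: 'Tech News', score: 40 },
    { title: 'B', category: 'Tech News', score: 75 },
    { title: 'C', category: 'Products', score: 20 }
  ],
  1
);
console.log(digestContent.categories.map(cat => cat.items[0].title)); // [ 'B', 'C' ]
```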

Checkpoint

How does the content digest generator group content into categories?

8

📋 Best Practices

Content Curation Tips

DO:

  • ✅ Define clear criteria for relevance
  • ✅ Deduplicate across sources
  • ✅ Include source attribution
  • ✅ Set reasonable limits (avoid overload)
  • ✅ Review and refine filters regularly

DON'T:

  • ❌ Curate without purpose
  • ❌ Include everything (quality > quantity)
  • ❌ Ignore feedback on usefulness
  • ❌ Forget to respect source terms of service

Checkpoint

Why do you need to define clear criteria for relevance?

9

📝 Hands-On Exercise

Content Curation Challenge

Build your content system:

  1. Set up RSS feed aggregator (5+ feeds)
  2. Implement keyword filtering and scoring
  3. Add AI summarization for top articles
  4. Create weekly digest email
  5. Track engagement metrics

Stay informed effortlessly! 📚

Checkpoint

Which challenges from the exercise have you completed?

10

🧠 Key Takeaways

Remember
  • 🎯 Define your focus - Not everything is relevant
  • 📊 Score content - Prioritize what matters
  • 🤖 Use AI wisely - Summarize, don't replace reading
  • 📧 Batch delivery - Digest > constant stream
  • 🔄 Iterate filters - Refine over time

Checkpoint

Why is a digest better than a constant stream?

🚀 Next Lesson

Social Media Automation: multi-platform posting and engagement.