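I have an array of objects that I want to iterate over, making an Axios call for each one and processing the response with a few functions. Unfortunately, the final output is an array containing several nested arrays with the same repeated objects, and those objects only hold the results for the first element of the newspapers array.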
const newspapers = [
  {
    "name": "CNN",
    "address": "https://edition.cnn.com/specials/world/cnn-climate",
    "base": "https://edition.cnn.com"
  },
  {
    "name": "The Guardian",
    "address": "https://www.theguardian.com/environment/climate-crisis",
    "base": "https://www.theguardian.com"
  }, etc...]
// Initiate global variable for the results
let articles = [];
// Function to remove duplicates, get img if present and consolidate data
function storeData(element, base, name) {
  const results = [];
  element.find("style").remove();
  const title = element.text();
  const urlRaw = element.attr("href");
  const url =
    urlRaw.includes("www") || urlRaw.includes("http") ? urlRaw : base + urlRaw;
  // Check for duplicated url
  if (tempUrls.indexOf(url) === -1) {
    // Check for social media links and skip
    if (!exceptions.some((el) => url.toLowerCase().includes(el))) {
      tempUrls.push(url);
      // Get img if child of anchor tag
      const imageElement = element.find("img");
      if (imageElement.length > 0) {
        // Get the src attribute of the image element
        results.push({
          title,
          url,
          source: name,
          imgUrl: getImageFromElement(imageElement),
        });
      } else {
        results.push({
          title,
          url: url,
          source: name,
        });
      }
    }
  }
  return results;
}
// Cheerio function
function getElementsCheerio(html, base, name, searchterms) {
  const $ = cheerio.load(html);
  const termsAlso = searchterms.also;
  const termsOnly = searchterms.only;
  const concatInfo = [];
  termsAlso.forEach((term) => {
    $(`a:contains("climate"):contains(${term})`).each(function () {
      const tempData = storeData($(this), base, name);
      tempData.map((el) => concatInfo.push(el));
    });
  });
  termsOnly.forEach((term) => {
    $(`a:contains(${term})`).each(function () {
      const tempData = storeData($(this), base, name);
      tempData.map((el) => concatInfo.push(el));
    });
  });
  return concatInfo;
}
// API
app.get("/news", (req, res) => {
  // Query String
  const query = checkForQuery(req);
  const wordsToSearch = query ? verifyQuery(query) : "";
  Promise.all(
    newspapers.map(({ name, address, base }) =>
      axios
        .get(address, {
          headers: { "Accept-Encoding": "gzip,deflate,compress" },
        })
        .then((res) => {
          const html = res.data;
          console.log({ name, address, base });
          const scrappedElements = getElementsCheerio(
            html,
            base,
            name,
            wordsToSearch
          );
          scrappedElements.map((item) => articles.push(item));
          return articles;
        })
    )
  ).then((articles) => {
    res.json(articles);
  });
});
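When I log the loop, I can see it iterating correctly, but the same two articles retrieved from the first newspaper also show up for every other newspaper: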
console.log / result:
{
name: 'CNN',
address: 'https://edition.cnn.com/specials/world/cnn-climate',
base: 'https://edition.cnn.com'
}
[{title: article1,
url: article1,
source: article1,
imgUrl: article1},
{title: article2,
url: article2,
source: article2,
imgUrl: article2}]
{
name: 'The Times',
address: 'https://www.thetimes.co.uk/environment/climate-change',
base: 'https://www.thetimes.co.uk'
}
[{title: article1,
url: article1,
source: article1,
imgUrl: article1},
{title: article2,
url: article2,
source: article2,
imgUrl: article2}]
etc...
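How can I fix this? Why does it keep collecting the same articles from the first newspaper, even though a new object with another newspaper's information is being passed through?
Any help is greatly appreciated. I'm a front-end developer doing this to learn, and I know I probably lack some basics that would have avoided this silly problem. Thanks in advance!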
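For reference, the nesting part of the symptom can probably be reproduced without any network access. The sketch below is only an illustration under assumptions: `fakeScrape`, `sources`, `collectShared` and `collectFlat` are hypothetical names that do not appear in the code above, and `fakeScrape` stands in for the `axios.get` plus `getElementsCheerio` step. It shows that when every `.then` callback pushes into one shared array and returns that same array, `Promise.all` resolves to several references to a single array, whereas returning only each source's own items and flattening once gives a flat list. It does not cover `tempUrls` or `exceptions`, which are not shown in the question, so it may not explain why only the first newspaper's articles appear.

```javascript
// Hypothetical stand-in for axios.get + getElementsCheerio: resolves with
// two made-up articles for one source. No network access is involved.
function fakeScrape(name) {
  return Promise.resolve([
    { title: `${name} story 1`, source: name },
    { title: `${name} story 2`, source: name },
  ]);
}

// Hypothetical list of sources, standing in for the newspapers array.
const sources = ["CNN", "The Guardian", "The Times"];

// Pattern from the question: each callback pushes into one shared array
// and returns that same array, so Promise.all collects N references to it.
function collectShared() {
  const shared = [];
  return Promise.all(
    sources.map((name) =>
      fakeScrape(name).then((items) => {
        items.forEach((item) => shared.push(item));
        return shared;
      })
    )
  );
}

// Alternative: each promise resolves with its own items only, and the
// per-source arrays are flattened once at the end.
function collectFlat() {
  return Promise.all(sources.map((name) => fakeScrape(name))).then(
    (perSource) => perSource.flat()
  );
}

async function main() {
  const nested = await collectShared();
  // Three entries, all pointing at the same six-item array.
  console.log(nested.length, nested[0] === nested[1], nested[0].length); // 3 true 6

  const flat = await collectFlat();
  console.log(flat.length); // 6
}

main();
```

In the sketch the shared array is created inside `collectShared`, so it starts empty on every call. In the code above, `articles` is declared once at module level and never reset, so it would also keep growing across repeated requests to `/news`.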