The main error is that you call the css function with an XPath selector for next_page:
next_page = response.css("//a[@class='btn-next btn']/@href").get()
The next problem is that you yield the next-page request inside the for
loop. This creates the same request many times, once per loop iteration.
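To illustrate the second problem without running a spider, here is a minimal plain-Python sketch. The two generator functions and the `count_requests` helper are hypothetical stand-ins for a `parse` method, and the tuples stand in for Scrapy items and requests; none of these names come from your code.

```python
def requests_inside_loop(items, next_page):
    """Stand-in for a parse() that yields the follow-up request inside the loop."""
    for item in items:
        yield ("item", item)
        if next_page:
            # Runs once per item, so the same request is produced repeatedly.
            yield ("request", next_page)


def requests_after_loop(items, next_page):
    """Stand-in for a parse() that yields the follow-up request after the loop."""
    for item in items:
        yield ("item", item)
    if next_page:
        # Runs once per page, regardless of how many items were found.
        yield ("request", next_page)


def count_requests(results):
    """Count how many of the yielded results are requests."""
    return sum(1 for kind, _ in results if kind == "request")


items = ["a", "b", "c"]
print(count_requests(requests_inside_loop(items, "/page/2")))  # 3
print(count_requests(requests_after_loop(items, "/page/2")))   # 1
```

Scrapy's default duplicate filter normally discards the repeated requests, but creating them is still wasted work and makes the code harder to follow.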
So I suggest these changes:
def parse(self, response):
    name1 = input(" Please enter input : ")
    name1 = name1.lower()

    links = response.xpath("//div[@class='media-list']/article/a/@href").extract()
    headers = response.xpath('//div[@class="media-body"]/h5/text()').extract()
    headers1 = [c.strip().lower() for c in headers]
    # my changes start here:
    raw_data = zip(headers1, links)
    # use fewer variables in the loop (purely cosmetic, but the code is more readable)
    for header, link in raw_data:
        if name1 in header:
            yield {'page': response.url, 'title': header, 'link': link}
    # use a proper CSS selector here
    next_page = response.css("a.btn-next::attr(href)").get()
    # this whole block is moved out of the for loop
    if next_page:
        yield response.follow(next_page)