chris
Published 2022-07-14

Scraping Amazon and Walmart Product Data with Google Sheets

While scrolling through Twitter over the last couple of days, I came across an impressive video:

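An Amazon product link is entered into a spreadsheet, and the cells after it can then reference that product's attributes directly, which is very convenient. It is effectively an Amazon scraper: once you know the ASIN, you can pull the product information into the sheet.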

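I followed along. First, you have to use Google Sheets.
Second, you have to install the ImportFromWeb add-on; without it the function is not available.
The product link is built by concatenation:
https://www.amazon.com/dp/ + the product's ASIN
I tried it and it works exactly as in the video. Sometimes the cells show Loading for a while before the data appears.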
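The concatenation step above can be sketched in Python, for example to prepare a column of links before pasting them into the sheet. This is only an illustration: the helper names `amazon_product_url` and `importfromweb_formula` are hypothetical, the 10-character alphanumeric check is an assumption about what a typical ASIN looks like, `B0EXAMPLE1` is a made-up placeholder, and the formula string assumes the add-on's function is called as `=IMPORTFROMWEB(url, selector)`, so check the add-on's documentation before relying on it.

```python
import re

AMAZON_DP_PREFIX = "https://www.amazon.com/dp/"

# Assumption: an ASIN is 10 characters, uppercase letters and digits only.
_ASIN_RE = re.compile(r"^[A-Z0-9]{10}$")


def amazon_product_url(asin: str) -> str:
    """Return the product URL: the /dp/ prefix followed by the ASIN."""
    cleaned = asin.strip().upper()
    if not _ASIN_RE.match(cleaned):
        raise ValueError(f"not a plausible ASIN: {asin!r}")
    return AMAZON_DP_PREFIX + cleaned


def importfromweb_formula(url_cell: str, selector: str) -> str:
    """Build a formula string to paste into a sheet cell.

    Assumption: the add-on exposes IMPORTFROMWEB(url, selector).
    """
    return f'=IMPORTFROMWEB({url_cell}, "{selector}")'


if __name__ == "__main__":
    # "B0EXAMPLE1" is a placeholder, not a real product.
    print(amazon_product_url("B0EXAMPLE1"))
    print(importfromweb_formula("A2", "title"))
```

With the URL in cell A2, the second helper produces the text you would type into a neighbouring cell to fetch the `title` field.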

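The video shows fields such as image / title / asin / sale_price. Curious about which fields are actually supported, I found the documentation on the add-on's official site. ImportFromWeb supports not only Amazon product pages but also Amazon search pages and Walmart product pages.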

I. Amazon product page: supported fields and descriptions

  1. General Information
    asin : Amazon Standard Identification Number (ASIN)
    title : Product name
    url : Product URL
    brand_name : Product’s brand
    manufacturer : Manufacturer’s name
    model : Item model number
    country_of_origin : Product’s country of origin
    categories : All the categories the product fits in (as seen in the breadcrumb)
    categories_links : URLs of the categories the product fits in
    best_seller_main_category : Best seller rank category
    best_seller_main_rank : Best seller rank
    rating : Average rating
    times_evaluated : Number of reviews

  2. Product description
    a_plus_content : Manufacturer’s product description
    bullet_point_X : Bullet point X in product’s description (replace X by your chosen number)
    bullet_points : Product’s description bullet points
    description : Product description

  3. Media
    featured_image_source : Product’s featured image
    image_X_source : Product’s image number X (replace X by your chosen number)
    other_images_sources : All the product’s images
    has_video : Whether the Amazon page has a video (true or false)

  4. Offer / price
    availability : Available quantity in stock
    list_price : Manufacturer Suggested Retail Price
    sale_price : Sale price of the product
    sale_price_per_unit : Unit sale price of the product
    buybox_quantity_max : Maximum selectable quantity
    buybox_winner : Buybox seller
    buybox_winner_link : URL of the Buybox seller page
    vendors_names* : Alternative sellers*
    vendors_prices* : Prices of the alternative sellers*
    vendors_links* : URL of the alternative sellers pages*

  5. Product variations
    current_variation_headers : Headers of all the possible variations
    variation_X_name : Name of variation X (replace X by your chosen number)
    variation_X_child_images_source : All images of the selected variation (replace X by your chosen number)
    variation_X_asins : All ASINs of the selected variation (replace X by your chosen number)
    variation_X_child_texts : All values of the selected variation (replace X by your chosen number)
    variation_X_child_Y_text : Value Y in variation X
    variation_X_child_Y_asin : ASIN of product Y in variation X
    variation_X_child_Y_image_source : Image of product Y in variation X
    current_variation_values : Selected variations values

  6. Technical characteristics
    feature_headers : Product featured characteristics headers
    feature_values : Product featured characteristics values
    details_headers : Product characteristics headers
    details_values : Product characteristics values
    capacity : Product capacity
    color_name : Product color
    style_name : Product style
    Has_climate_pledge_friendly_badge : Whether the Amazon product has a Climate Pledge Friendly certification (true or false)

  7. Product’s measurements
    item_dimensions_unit_of_measure : Unit of measure used for the dimensions of the product
    item_height : Product’s height
    item_height_unit_of_measure : Unit of measure used for the height of the product
    item_length : Product’s length
    item_length_unit_of_measure : Unit of measure used for the length of the product
    item_weight : Product’s weight
    item_weight_unit_of_measure : Unit of measure used for the weight of the product
    item_width : Product’s width
    item_width_unit_of_measure : Unit of measure used for the width of the product

  8. Parcel / Package
    package_dimensions_unit_of_measure : Package’s dimension unit of measure (cm, inch)
    package_height : Package’s height
    package_length : Package’s length
    package_weight : Package’s weight
    package_weight_unit_of_measure : Package’s weight unit of measure (grams, pounds)
    package_width : Package’s width
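Several of the fields above contain X and Y placeholders (for example `bullet_point_X` or `variation_X_child_Y_text`). A minimal sketch of filling them in programmatically is shown below. The function names are hypothetical, and the sketch assumes that numbering starts at 1, which the list above does not state.

```python
from typing import List, Optional


def expand_selector(template: str, x: int, y: Optional[int] = None) -> str:
    """Replace the standalone X (and optional Y) tokens in a field name.

    The name is split on underscores so that only whole "X" / "Y" tokens
    are replaced, never letters inside other words.
    """
    parts = []
    for token in template.split("_"):
        if token == "X":
            parts.append(str(x))
        elif token == "Y":
            if y is None:
                raise ValueError(f"{template!r} needs a value for Y")
            parts.append(str(y))
        else:
            parts.append(token)
    return "_".join(parts)


def bullet_point_selectors(count: int) -> List[str]:
    """Field names for the first `count` bullet points.

    Assumption: bullet points are numbered from 1.
    """
    return [expand_selector("bullet_point_X", i) for i in range(1, count + 1)]


if __name__ == "__main__":
    print(expand_selector("image_X_source", 3))
    print(expand_selector("variation_X_child_Y_text", 1, 2))
    print(bullet_point_selectors(3))
```

The generated names can then be placed in a header row, one per column, so each column requests one concrete field.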

II. Amazon search page: supported fields and descriptions
asin : Collects each product ASIN
title : Collects each product title
price : Collects each product price
rating : Collects average rating for each product listed
reviews : Collects the number of reviews for each product listed
link* : Collects the URL of each product listed
featured_image_source : Collects the main image source for each product listed
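The search-page fields need a search results URL rather than a product URL. The post does not show how such a URL is built, so the following is only a sketch under assumptions: it uses the common `https://www.amazon.com/s?k=...` form with an optional `page` parameter, and the helper name `amazon_search_url` is hypothetical.

```python
from urllib.parse import urlencode

AMAZON_SEARCH_PREFIX = "https://www.amazon.com/s?"


def amazon_search_url(keyword: str, page: int = 1) -> str:
    """Build a search results URL for a keyword.

    Assumption: the keyword goes in the `k` query parameter and later
    result pages are selected with `page`.
    """
    params = {"k": keyword}
    if page > 1:
        params["page"] = str(page)
    # urlencode escapes spaces and special characters in the keyword.
    return AMAZON_SEARCH_PREFIX + urlencode(params)


if __name__ == "__main__":
    print(amazon_search_url("usb c cable"))
    print(amazon_search_url("usb c cable", page=2))
```

Because every search-page field returns one value per listed product, it is worth leaving enough empty rows below the formula cell for the results.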

III. Walmart product page: supported fields and descriptions
title : Product name
brand : Product brand
brandLink : Link to the brand page of the product
price* : Product price
stars : Average rating
ratings : Number of reviews
categoryN** : Category of the product (replace N by the desired value)
categories : All product categories (make sure to leave empty cells below)
featured_image_source : Collects the main image source for each product listed
out_of_stock : If the product is not available
imageN_source : Image of the product page (replace N by the image number)
images_source : All the images of the product page (make sure to leave empty cells below)

