AnalyseIndexItem

Deprecated - Scheduled for removal - Replaced by DownloadBackLinks

Status

Command status: DEPRECATED
Supported by OpenApps API: No
Supported by Internal/Reseller API: Yes
Possibly queued processing: Yes

Purpose

This command will run actual analysis of index items using optional filters and return the desired analysis results broken down into the following categories:

  1. Source URLs
  2. Target URLs (selected top URLs from supplied domain/subdomain or actual URL that was given)
  3. Anchor text
  4. Referring domains
  5. ACRank distributions
  6. Link type distributions

Although this command attempts to execute requests in real time, items with a large number of backlinks will have their processing queued, in the same way as the DownloadBackLinks command.

Resources consumed

Resource Description

AnalysisResUnits

This resource will be decreased by the number of external backlinks that were analysed by this command.

RetrievalResUnits

This resource will be decreased by the number of backlinks retrieved (returned) by this command.

Parameters

Please specify SkipIfAnalysisCostGreaterThan to avoid unintended analysis of very large domains (such as google.com) and/or call GetIndexItemInfo command to get analysis cost before actually calling this command. To find out more, please read Common Problems.

Parameter Description

cmd

Required: must be set to: AnalyseIndexItem

datasource

Optional - defaults to historic
Either: "fresh" - to query against Fresh Index, or "historic" - to query against Historic Index.

items

Required: must be a positive integer from 0 to maximum of 20 indicating how many items this query supplies in itemN parameters

item0, item1 ... itemN

One or more items, starting from item0, to be analysed.

Note: do not forget to URL encode each item even if item may already contain URL encoding.
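
The cmd, datasource, items and itemN parameters described above can be assembled as in the following minimal Python sketch. The function name build_item_params is hypothetical, and rejecting an empty item list is an assumption of this sketch rather than documented behaviour. Only the query string is built; the API endpoint and key are deliberately left out.

```python
from urllib.parse import urlencode

MAX_ITEMS = 20  # upper bound on "items" stated in the parameter description


def build_item_params(items, datasource="historic"):
    """Return the URL-encoded query fragment for a list of index items.

    Hypothetical helper: rejecting an empty list is an assumption of this sketch.
    """
    if not 1 <= len(items) <= MAX_ITEMS:
        raise ValueError(f"expected 1 to {MAX_ITEMS} items, got {len(items)}")
    params = {
        "cmd": "AnalyseIndexItem",
        "datasource": datasource,
        "items": len(items),
    }
    # item0, item1 ... itemN are numbered from zero
    for n, item in enumerate(items):
        params[f"item{n}"] = item
    # urlencode percent-encodes every value, including any "%" that is
    # already part of the item, which is what the note above asks for
    return urlencode(params)
```

For instance, an item that already contains %20 is sent as %2520, so the server receives the item exactly as supplied.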

SkipIfAnalysisCostGreaterThan

If this parameter is set, any index items which have a greater analysis cost than this unit will not be analysed. Setting this parameter prevents accidentally analysing extremely large index objects.

For this reason, we highly recommend setting this parameter.

Default: -1 (not set)

Target URLs

MaxTargetURLs

Sets maximum number of target urls (belonging to the analysed domains or subdomains) returned by the command. In case of analysis requested for a specific URL only data on that URL will be returned.

If set to -1 then maximum allowed number of urls will be returned. If set to 0 then no elements will be returned at all.

If not specified then default number of 20 will be assumed.

Internal maximum: 10000 (subject to change upwards)

Source URLs

MaxSourceURLs

Sets maximum number of source URLs to be returned by analysis command for each of the items.

If set to -1 then all available data will be returned, otherwise only top elements will be returned. If set to 0 then no elements will be returned at all.

If not specified then default number of 20 will be assumed.

MaxSourceURLsPerRefDomain

If set to greater than 0, then this value will limit number of source urls taken from any given referring domain.

If set to 1, then it will effectively produce list of referring domains with just 1 best backlink from each of them.

Default: -1 (not set)

MaxSameSourceURLs

Some index items (usually domains) may have more than one link out from any given source url, sometimes with different anchor text, sometimes with different flags. This parameter is designed to control maximum number of such backlinks returned.

If set to greater than 0, then it will set a limit to number of same source urls returned.

Note: Usage of this option will result in some anchor text/flags combinations not returned back.

If set to 1 it will guarantee that only unique source urls returned.

Default: -1 (not set)

Example:

http://www.example.com (source URL) links to http://www.example.org with anchor text "example"
http://www.example.com (source URL) links to http://www.example.org with anchor text "another example"

If this parameter is set to 1, then only the first source URL/anchor text combination will be returned. It is undefined which precise anchor text/flags combination will get priority in such case. It is recommended to use filtering flags to ensure undesired backlinks are eliminated before this parameter takes effect.
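
The effect of MaxSameSourceURLs can be pictured with a small client-side Python sketch. It illustrates the described behaviour and is not the server's implementation. The function name limit_same_source_urls and the dictionary-based row shape are hypothetical, and keeping the first row seen is an assumption, since the API leaves the surviving combination undefined.

```python
from collections import Counter


def limit_same_source_urls(backlinks, max_same):
    """Keep at most max_same rows per source URL.

    A value of -1 (or anything not greater than 0) applies no limit,
    mirroring the "not set" default. Keeping the first row seen is an
    assumption of this sketch; the API does not define which one survives.
    """
    if max_same <= 0:
        return list(backlinks)
    seen = Counter()
    kept = []
    for link in backlinks:
        if seen[link["SourceURL"]] < max_same:
            kept.append(link)
        seen[link["SourceURL"]] += 1
    return kept
```

With the two example backlinks above and a limit of 1, only one row for http://www.example.com remains, so the other anchor text is lost from the output.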

ACRank Distributions

EnableACRankDistributions

If set to 1 then ACRank distributions will be created, any other value will switch them off.

Default: 1

Anchor text

MaxAnchorText

If set to -1 then all available data will be returned, otherwise only top elements will be returned.

If set to 0 then no elements will be returned at all.

If not specified then default number of 20 will be assumed.

AnchorTextSortBy

Controls how top anchor texts are selected:

Values:

0: highest ACRank of backlink (default)
1: highest number of external backlinks
2: highest number of referring domains
3: highest number of referring IPs

Referring domains

MaxRefDomains

If set to -1 then all available data will be returned, otherwise only top elements will be returned.

If set to 0 then no elements will be returned at all.

If not specified then default number of 20 will be assumed.

RefDomainsSortBy

Controls how top referring domains are selected:

Values:

  1. highest ACRank of backlink from that referring domain (default)
  2. highest number of backlinks from that domain

Link type distributions

EnableLinkTypeDistributions

If set to 1, then link type distributions will be created for analysed index items.

Link type distributions show counts of backlinks with all link flag (ie nofollow, image) combinations.

Set bit Flag/Type
0 NoFollow
1 Deleted
2 TextLink
3 ImageLink
4 Redirect
5 Frame
6 Mention

Note: more flags will be added in the future, this may result in shifting of bits, however they will always retain the same header names. Some combinations of flags never happen, ie Redirects can't be Frames at the same time.
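
As a rough illustration of how a flag combination maps to the bit table above, the following Python sketch decodes an integer mask into flag names. It assumes that bit 0 is the least significant bit and that a combination is available as an integer; neither is confirmed by this page. Because bit positions may shift, production code should rely on the header names in the response rather than on a hard-coded table like this one. The name decode_link_flags is hypothetical.

```python
# Bit positions as listed in the table above (may change in future versions)
LINK_FLAGS = ["NoFollow", "Deleted", "TextLink", "ImageLink", "Redirect", "Frame", "Mention"]


def decode_link_flags(mask):
    """Return the names of the flags whose bits are set in mask.

    Assumes bit 0 is the least significant bit (an assumption of this sketch).
    """
    return [name for bit, name in enumerate(LINK_FLAGS) if mask & (1 << bit)]
```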

Default: -1 (not enabled)

Backlink Trends

EnableBackLinkTrends

If set to 1 then backlink trends will be returned showing discovery rates for external backlinks and referring domains by month.

Default: -1 (not enabled)

Debug parameters
(use for testing purposes only)

DebugForceQueue

If set to 1 then the analysis request will be forced into the queue, simulating a request to analyse a large index item.

Advanced parameters
to control how queued responses will be handled
(optional - but managing of queued responses must be supported)

NotifyURL

The notification URL specified must be accessible from outside your intranet - do not specify internal servers that can't be accessed from our servers or this will fail.

Optional: if specified, this URL (with HTTP/HTTPS protocols only) will be requested to notify you that the download has been fully prepared. The URL you provide can contain query string parameters to help you identify the download request that was made. We will also substitute the following macro variables, if they are present (case sensitive):

  • %%DOWNLOAD_FILE%% - will be changed to the download filename
  • %%DOWNLOAD_FILE_LOCATION%% - will be changed to the download filename location using PublicDownloadLocation variable below

Your notify URL should respond with a single piece of data: OK (2 characters - no HTML) to indicate that you have successfully received this notification. Any other response (including server failures on your side) will be treated as an error. In the case of an error, the notification URL will be called again a number of times using exponential backoff before failing.

Example of notification url: http://www.example.com/mjseonotify.php (this URL doesn't actually exist)
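
A notification endpoint that satisfies the "OK, 2 characters, no HTML" rule could look like the following Python sketch, built on the standard library HTTP server. It assumes the notification arrives as a GET request, which this page does not state. The names NotifyHandler and make_server are hypothetical.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


class NotifyHandler(BaseHTTPRequestHandler):
    """Acknowledges each notification with the plain two-character body OK."""

    received = []  # query parameters of every notification seen so far

    def do_GET(self):  # assumption: the notification is sent as a GET request
        # Query string parameters (including substituted macros) identify the download
        NotifyHandler.received.append(parse_qs(urlparse(self.path).query))
        body = b"OK"  # exactly 2 characters, no HTML
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence default request logging in this sketch


def make_server(host="", port=8080):
    """Create (but do not start) a server; call serve_forever() on the result."""
    return HTTPServer((host, port), NotifyHandler)
```

Calling make_server().serve_forever() starts the listener. Any heavy processing of the prepared file is better done after the response is sent, so that the acknowledgement is not delayed.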

UploadToFTP

Important: this FTP URL must be accessible from outside your intranet - do not specify internal servers that can't be accessed from our side: if you do then you will never get the upload!

Optional: if not specified prepared file with backlinks will be uploaded to majestic.com, this behaviour can be overridden by specifying your own FTP server using appropriate URI format such as:

ftp://username:password@yourftpserver.com/public_html/uploads/ with the relative path of the upload directory, or
ftp://username:password@yourftpserver.com//home/user/public_html/uploads/ with the absolute path of the upload directory.

Make sure that the user specified is allowed to write files in the directory that you designated for these uploads. If a trailing filename is specified, this will be prepended to the output filename when it is uploaded to your server.

If not specified the upload will be made to www.majestic.com from where you will be able to download the file.

PublicDownloadLocation

Optional: by default the prepared backlinks file will be available from the URL http://www.majestic.com/downloads/ - if you use an alternative FTP location and know which public URL corresponds to it, you are advised to supply it here unless you are planning to analyse the data locally.

Filtering rules

Filtering rules enable targeting processing of backlinks in order to retrieve data that matches specified criteria. For example a particular analysis request may wish to focus on backlinks marked as "nofollow" or exclude such links from any analysis. Filtering rules can be combined, ie: analyse only backlinks found in a particular time period that exclude those of them that were marked as "nofollow".

Rule name Description

Target URLs filtering rules

URLs

One or more Target URLs delimited by CR LF - only meaningful for domain level analysis, this will force analysis to run only on backlinks pointing to these URLs. Exact matching of URLs will be performed.

IncludeMatchedURLs

Comma delimited list of strings that will be required to be matched in Target URLs in order for them to be analysed.

ExcludeMatchedURLs

Same as IncludeMatchedURLs but matched Target URLs will be excluded.

Source URL filtering rules

FlagIncludeDeleted (also works old flag: FlagIncludeOldCrawl)

FlagIncludeNoFollow
FlagIncludeRedirect
FlagIncludeImage
FlagIncludeFrame
FlagIncludeAltText
FlagIncludeMention

Includes into analysis source URLs that had at least one of the following flags set:

  • Deleted (also works OldCrawl) - links that were present on a page but after recrawl found to be deleted (removed)
  • NoFollow - links marked with "nofollow"
  • Redirect - links that were redirecting
  • Image - links that were image based rather than text
  • Frame - source URL was used in a frameset (useful to find if someone frames content)
  • AltText - text used in title attribute of A tag
  • Mentions - text mentions of a domain or link

See more about Source Flags: http://www.majestic.com/support/glossary#SourceFlags

FlagExcludeDeleted (also works old flag: FlagExcludeOldCrawl)

FlagExcludeNoFollow
FlagExcludeRedirect
FlagExcludeImage
FlagExcludeFrame
FlagExcludeAltText
FlagExcludeMention

Same as FlagInclude* filtering rules, only setting them will result in exclusion of source URLs with specified flags set.

Recommended use: exclude non-rank passing backlinks such as those marked with Deleted, NoFollow, Mention, AltText, Frame, Redirect flags.

IncludeMatchedRefDomains

If set then only backlinks for matching referring domains will be retrieved. Referring domains can be comma delimited to provide multiple referring domains of interest.

Example of valid referring domains: example.com,example.net

It's possible that there will be no backlinks for specified referring domains in which case empty file (with headers) will be produced.

IncludeExactAnchorText

If set then only backlinks with exactly matching (lower cased) anchor text will be selected.

Example: "yahoo" (without quotes) - this will only match anchor texts that are exactly "yahoo", it won't match "yahoo!" or anything different.

Use | (pipe) to separate multiple anchor texts, though this is not recommended.

ExcludeExactAnchorText

Same as IncludeExactAnchorText but matched anchor text items will be excluded from analysis.

IncludeContainsAnchorText

If set then only backlinks with anchor text containing (lower cased) specified anchor texts will be selected.

Example: yahoo

This will match anchor texts that contain that word, ie: "yahoo!", "click here to go yahoo" etc

Use | (pipe) to separate multiple anchor texts, though this is not recommended.

ExcludeContainsAnchorText

Same as IncludeContainsAnchorText but matched anchor text items will be excluded from analysis.
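
The difference between the exact and contains anchor text rules can be sketched client-side in Python as follows. This is an illustration of the described matching, not the server's code. It assumes both the filter value and the anchor text are compared in lower case, and the function names are hypothetical.

```python
def parse_anchor_filter(value):
    """Split a pipe-separated filter value into lower-cased anchor texts."""
    return [part.lower() for part in value.split("|") if part]


def matches_exact(anchor_text, filter_value):
    """True if the anchor text equals one of the filter's anchor texts."""
    return anchor_text.lower() in parse_anchor_filter(filter_value)


def matches_contains(anchor_text, filter_value):
    """True if the anchor text contains one of the filter's anchor texts."""
    text = anchor_text.lower()
    return any(part in text for part in parse_anchor_filter(filter_value))
```

With the filter value yahoo, the exact rule accepts "yahoo" and rejects "yahoo!", while the contains rule accepts both as well as "click here to go yahoo". The Exclude* variants would drop the rows these functions accept.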

IncludeMatchedRefTLDs

Comma delimited list of TLDs (Top Level Domains) that will be included in analysis, ie: edu,gov - note there is no . in front of TLD.

ExcludeMatchedRefTLDs

Same as IncludeMatchedRefTLDs but matched referring TLDs will be excluded from analysis.

EnableBackLinkDateRange

If set to 1 then date range analysis will be enabled.

IncludeBackLinksDateFrom

Date in format of DD/MM/YYYY (ie: 05/02/2008 for 5 February 2008) - links from Source URLs with first found date starting at that moment will be included in analysis.

Requires EnableBackLinkDateRange to be set to 1.

IncludeBackLinksDateTo

Same as IncludeBackLinksDateFrom only this is the cut off date for backlinks analysis by date.

Requires EnableBackLinkDateRange to be set to 1.
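
The three date range parameters belong together, so it can help to generate them in one place. The Python sketch below formats dates as DD/MM/YYYY and always sets EnableBackLinkDateRange to 1. The function name date_range_params is hypothetical, and rejecting a reversed range is a safeguard added by this sketch.

```python
from datetime import date


def date_range_params(date_from, date_to):
    """Return the date range filtering parameters for the given dates."""
    if date_from > date_to:
        # safeguard added by this sketch, not documented API behaviour
        raise ValueError("date_from must not be later than date_to")
    fmt = "%d/%m/%Y"  # DD/MM/YYYY, as required by the API
    return {
        "EnableBackLinkDateRange": 1,
        "IncludeBackLinksDateFrom": date_from.strftime(fmt),
        "IncludeBackLinksDateTo": date_to.strftime(fmt),
    }
```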

IncludeMatchedIPs

Includes links from Source URLs whose resolved referring domain IP addresses match a comma delimited list of IPs. Prefix matching is used, so you can match subnets by leaving out the end of the IP, ie: 212.34.4.

ExcludeMatchedIPs

Same as IncludeMatchedIPs but matched IPs will be excluded from analysis.
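
Prefix matching on IPs can be pictured with the Python sketch below, which treats each entry of the comma delimited list as a plain string prefix. That reading of "prefix matching" is an assumption, and the function name ip_matches is hypothetical. The sketch also shows why the trailing dot in 212.34.4. matters.

```python
def ip_matches(ip, prefixes):
    """True if ip starts with any entry of a comma delimited prefix list.

    Plain string prefix comparison, assumed here to mirror the API's
    "prefix matching"; the server's exact behaviour is not specified.
    """
    entries = [p.strip() for p in prefixes.split(",") if p.strip()]
    return any(ip.startswith(entry) for entry in entries)
```

Under this reading, 212.34.4. matches 212.34.4.17 but not 212.34.40.1, whereas 212.34.4 without the trailing dot would match both.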

IncludeMatchedGeoCountry

Includes backlinks whose referring domains were geo-located in the specified countries.

Use two letter country codes or NA for not geo-located IPs or unresolved domains.

List of country codes: http://www.maxmind.com/app/iso3166

ExcludeMatchedGeoCountry

Same as IncludeMatchedGeoCountry but backlinks from the matched countries will be excluded from analysis.

Sample queries and response

There are 2 possible scenarios for returned data:

  1. All requested information will be returned in real-time during the same request in form of response XML
  2. The request will be queued by the server and processed separately; the generated response is a GZIP compressed response XML that will be uploaded to the specified FTP server or the Majestic.com website, from which it can be downloaded

Requesting XML

This is a protocol-level example query that uses a special URL that was overridden to have zero cost of analysis (you will need to use your own API_KEY to analyse other urls):

Example responses

XML response

<?xml version="1.0" encoding="UTF-8"?>
<Result Code="OK" ErrorMessage="" FullError="">
<GlobalVars AnalysisResUnits="97071101" IndexBuildDate="2017-09-04 13:42:54" IndexType="0" RetrievalResUnits="19462795" ServerBuild="2017-10-13 13:57:22" ServerName="HUMMERR" ServerVersion="1.0.6495.23321" UniqueIndexID="20170904134254-HISTORICAL" />
<DataTables Count="6">
<DataTable Name="ItemResults" RowsCount="1" Headers="ItemNum|Item|ResultCode|ResultDescription|TotalBackLinks|ExtBackLinks|RefDomains|IPs|SubNets">
<Row>0|http://www.majestic.com/comparedomainbacklinkhistory.php|OK| |1|0|0|0|0</Row>
</DataTable>
<DataTable Name="TargetURLs" RowsCount="1" Headers="ItemNum|TargetUrlID|URL|Title|ACRank|ExtBackLinks|RefDomains|IPs">
<Row>0|0|http://www.majestic.com/comparedomainbacklinkhistory.php| |0|0|0|0</Row>
</DataTable>
<DataTable Name="ACRankDistributions" RowsCount="0" Headers="ItemNum|Type|ObjectID|BestACRank|BestACRankCount|R0|R1|R2|R3|R4|R5|R6|R7|R8|R9|R10|R11|R12|R13|R14|R15" />
<DataTable Name="RefDomains" RowsCount="0" Headers="ItemNum|Domain|IP|GeoCountryCode|TotalBackLinks|FlagRedirect|FlagFrame|FlagNoFollow|FlagImages|FlagDeleted|FlagAltText|FlagMention|BestACRank|DomainCitationFlow|DomainTrustFlow" />
<DataTable Name="SourceURLs" RowsCount="0" Headers="ItemNum|TargetUrlID|SourceURL|ACRank|AnchorText|Date|FlagRedirect|FlagFrame|FlagNoFollow|FlagImages|FlagDeleted|FlagAltText|FlagMention" />
<DataTable Name="AnchorText" RowsCount="0" Headers="ItemNum|AnchorText|ExtBackLinks|RefDomains|IPs|BestACRank|BestACRankCount" />
</DataTables>
</Result>
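
Each DataTable in the XML response carries its column names in the pipe-delimited Headers attribute and its values in pipe-delimited Row elements. A minimal Python sketch for turning them into dictionaries follows. It assumes field values never contain an unescaped pipe, since this page does not describe how pipes inside anchor text are escaped, and the function name parse_data_tables is hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_data_tables(xml_text):
    """Map each DataTable name to a list of {header: value} row dictionaries.

    Values are kept as strings. Assumes no unescaped "|" inside a field.
    """
    root = ET.fromstring(xml_text)
    tables = {}
    for table in root.iter("DataTable"):
        headers = table.get("Headers", "").split("|")
        rows = []
        for row in table.findall("Row"):
            values = (row.text or "").split("|")
            rows.append(dict(zip(headers, values)))
        tables[table.get("Name")] = rows
    return tables
```

Tables with RowsCount="0", such as the empty SourceURLs table above, simply map to an empty list.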

JSON response

{
"Code": "OK",
"ErrorMessage": "",
"FullError": "",
"AnalysisResUnits": 97071051,
"IndexBuildDate": "2017-09-04 13:42:54",
"IndexType": 0,
"RetrievalResUnits": 19462795,
"ServerBuild": "2017-10-13 13:57:22",
"ServerName": "HUMMERR",
"ServerVersion": "1.0.6495.23321",
"UniqueIndexID": "20170904134254-HISTORICAL",
"DataTables": {
  "ItemResults": {
    "Headers": {},
    "Data": [
      {
        "ItemNum": 0,
        "Item": "http://www.majestic.com/comparedomainbacklinkhistory.php",
        "ResultCode": "OK",
        "ResultDescription": "",
        "TotalBackLinks": 1,
        "ExtBackLinks": 0,
        "RefDomains": 0,
        "IPs": 0,
        "SubNets": 0
      }
    ]
  },
  "TargetURLs": {
    "Headers": {},
    "Data": [
      {
        "ItemNum": 0,
        "TargetUrlID": 0,
        "URL": "http://www.majestic.com/comparedomainbacklinkhistory.php",
        "Title": "",
        "ACRank": 0,
        "ExtBackLinks": 0,
        "RefDomains": 0,
        "IPs": 0
      }
    ]
  },
  "ACRankDistributions": {
    "Headers": {},
    "Data": []
  },
  "RefDomains": {
    "Headers": {},
    "Data": []
  },
  "SourceURLs": {
    "Headers": {},
    "Data": []
  },
  "AnchorText": {
    "Headers": {},
    "Data": []
  }
}
}

Returned values

Return value Description
Global variables
Code Code indicating if this command executed successfully.
ErrorMessage A message explaining the error. This will be blank if the code is "OK".
FullError Verbose explanation of error.
IndexBuildDate Date/time the index that was queried was last updated.
IndexType Indicates if the index was Historic (0) or Fresh (1).
RetrievalResUnits Indicates how many retrieval res units the user has remaining.
ServerBuild Date/time the server was built.
ServerName Name of the server queried.
ServerVersion Version of the server queried.
UniqueIndexID Unique identifier for the index consisting of the date and index type.
Common headers

ItemNum

Item number from the original request

Item Item which command is executed with
ResultCode Code indicating status of command.
ResultDescription Verbose description of command status. Will be blank if status is ok.

TotalBackLinks

Total number of backlinks (external + internal) pointing to this item. Please note: internal counts are not currently provided, so this figure will be equivalent to ExtBackLinks.

ExtBackLinks

Number of external backlinks pointing to this item.

RefDomains

Number of referring domains pointing to this item.

IPs

Number of IPs that point to this item.

SubNets

Number of subnets that point to this item.

ItemResults section

Item

Original item that this result refers to

ExtBackLinks

Number of external backlinks pointing to this item

RefDomains

Number of referring domains pointing to this item

IPs

Number of unique IP addresses from resolved referring domains

SubNets

Number of unique C-Class IP subnets from resolved referring domains

TargetURLs

TargetUrlID

Unique TargetURL ID, within the specified ItemNum, that is assigned to this TargetURL - it may be referenced by SourceURLs (backlinks) to indicate the target URL they point to

URL

Target URL - it will be in the normalised form used in the backlinks index, and thus possibly slightly different than the URL that was requested, ie: http://www.example.com/ will be shown as http://www.example.com in normalised form (trailing / will be removed in this case).

Title

Title of the target URL if it was a successfully crawled page that had it

ACRank

ACRank value of this URL

ExtBackLinks

Number of external backlinks pointing to this item

RefDomains

Number of referring domains pointing to this item

IPs

Number of unique IP addresses from resolved referring domains

Handling deferred responses

XML response

<?xml version="1.0" encoding="UTF-8"?>
<Result Code="QueuedForProcessing" ErrorMessage="" FullError="">
<GlobalVars ETA="n/a" IndexBuildDate="2017-09-04 13:42:54" IndexType="0" JobID="E5B9EF217D547A29E6649BEA9229A083" Note="Forced queue due to debug request parameter" ReportName="AnalyseDomain" ReportPosition="1" ServerBuild="2017-10-13 13:57:22" ServerName="SHADOJAGUAR" ServerVersion="1.0.6495.23321" TotalReports="1" UniqueIndexID="20170904134254-HISTORICAL" UserID="895472" />
</Result>

JSON response

{
"Code": "QueuedForProcessing",
"ErrorMessage": "",
"FullError": "",
"ETA": "n/a",
"IndexBuildDate": "2017-09-04 13:42:54",
"IndexType": 0,
"JobID": "B73F10EE7F796BE65C8D227531753429",
"Note": "Forced queue due to debug request parameter",
"ReportName": "AnalyseDomain",
"ReportPosition": 1,
"ServerBuild": "2017-10-13 13:57:22",
"ServerName": "SHADOJAGUAR",
"ServerVersion": "1.0.6495.23321",
"TotalReports": 1,
"UniqueIndexID": "20170904134254-HISTORICAL",
"UserID": 895472
}
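
Since either scenario can occur for the same request, client code needs to branch on the Code value before reading any data tables. The Python sketch below does this for the JSON form of the response, using only the fields visible in the examples on this page. The function name interpret_response and the returned labels are hypothetical.

```python
import json


def interpret_response(body):
    """Classify a JSON response as ready, queued or error.

    Returns a (status, detail) pair: the data tables when ready, the JobID
    when queued, and the error message otherwise.
    """
    payload = json.loads(body)
    code = payload.get("Code")
    if code == "OK":
        return ("ready", payload.get("DataTables", {}))
    if code == "QueuedForProcessing":
        # The result will arrive later as a prepared file; keep the JobID
        return ("queued", payload.get("JobID"))
    return ("error", payload.get("ErrorMessage", ""))
```

A queued result would then be picked up through the notification URL or the downloads list once processing has finished.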

Related commands

This command is deprecated. Please see the documentation regarding DownloadBackLinks.

To see details on finding the files created by running this command, please see the documentation regarding GetDownloadsList.

To see details on deleting this job, please see the documentation regarding DeleteDownloads.

To see details on how to obtain the cost of analysis, please see the documentation regarding GetIndexItemInfo.

Common problems

Problem: Making requests to very large domains such as google.com uses up resources very quickly.
Solution: Use the SkipIfAnalysisCostGreaterThan parameter to avoid analysing domains that are too large, and/or call the GetIndexItemInfo command to get the analysis cost first.

Problem: Repeated calls to the same domain with slightly varied parameters can quickly use up available resources.
Solution: Call this command once, then manipulate the data on your end.

Problem: Calls to this command yield little to no results.
Solution: Ensure that your parameters aren't too narrow. In particular, check date range filtering: please consider that our main index is not updated every day.

Problem: The response takes a significant amount of time to return.
Solution: It is recommended to avoid batching items that have more than a few million backlinks (use GetIndexItemInfo to check them first), because the response will come back only when all such items have been processed. Batching is best used for smaller items rather than a few very large ones.