--> Web Scraping is a technique used to extract information from a website/web application using bots.
--> A common example of web scraping targets is websites that provide real-time information such as hotel prices or flight ticket prices.
--> We can use ASM to block web scraping attacks using the following methods:
1) Bot Detection:
--> By checking for human input events from the client, such as mouse clicks and keystrokes, ASM can detect whether the client is a bot.
--> ASM also checks how many web pages the client surfs over a period of time.
--> We need to configure 4 settings in order to implement Bot Detection for Web Scraping:
i) Rapid Surfing:
--> Here you tell ASM the maximum number of pages a client can access within a specific time.
--> By default, a client can trigger a maximum of 120 page refreshes, or load 30 different pages, within 30 seconds.
--> If a client exceeds the configured limits, ASM treats the client as a bot and blocks the request.
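The rapid-surfing check can be sketched as a sliding-window counter. This is a hypothetical Python illustration of the idea, not F5 code; the 30-page/30-second values mirror the defaults above:

```python
import time
from collections import deque

class RapidSurfingDetector:
    """Flags a client as a bot when it loads more pages
    within a time window than the configured threshold."""

    def __init__(self, max_pages=30, window_seconds=30):
        self.max_pages = max_pages
        self.window = window_seconds
        self.hits = deque()  # timestamps of recent page loads

    def record_page_load(self, now=None):
        """Returns True if this load pushes the client over the limit."""
        now = time.monotonic() if now is None else now
        self.hits.append(now)
        # Drop hits that fell out of the sliding window.
        while self.hits and now - self.hits[0] > self.window:
            self.hits.popleft()
        return len(self.hits) > self.max_pages
```

Each per-client counter only remembers hits inside the window, so a legitimate user who browses slowly never trips the limit.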
ii) Grace Interval:
--> This is the interval ASM uses to identify whether the client is a bot or a human.
--> By default, ASM decides whether the client is human or a bot within 100 requests.
--> If ASM detects a bot at any point during the Grace Interval, even before the interval completes, it activates the Blocking Period.
iii) Blocking Period:
--> Once ASM identifies the client as a bot, this setting defines how long requests from that client are blocked.
--> By default, after detecting a bot, ASM blocks the next 500 web requests.
--> Once the Blocking Period is over, ASM activates the Grace Interval again to re-check whether the client is a bot or a human.
iv) Safe Interval:
--> If ASM identifies the client as a human during the Grace Interval, it activates the Safe Interval.
--> During the Safe Interval, ASM does not block any web requests coming from the client.
--> By default, this setting is set to 2000 requests.
--> Once the Safe Interval is over, ASM activates the Grace Interval again to re-check whether the client is a bot or a human.
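The Grace/Blocking/Safe cycle described above behaves like a small state machine. The following is a minimal sketch of that cycle (illustrative only, not ASM's implementation), using the request-count defaults from the text:

```python
class ScrapingStateMachine:
    """Sketch of the Grace -> Blocking/Safe interval cycle.
    Interval lengths are counted in requests, as described above."""

    def __init__(self, grace=100, blocking=500, safe=2000):
        self.limits = {"grace": grace, "blocking": blocking, "safe": safe}
        self.state, self.count = "grace", 0

    def handle_request(self, looks_like_bot):
        """Returns 'allow' or 'block' for one incoming request."""
        self.count += 1
        if self.state == "grace":
            if looks_like_bot:
                self._enter("blocking")       # bot detected: start Blocking Period
                return "block"
            if self.count >= self.limits["grace"]:
                self._enter("safe")           # grace ended with no bot signal
            return "allow"
        if self.state == "blocking":
            if self.count >= self.limits["blocking"]:
                self._enter("grace")          # blocking over: re-check the client
            return "block"
        # Safe Interval: everything is allowed until it expires.
        if self.count >= self.limits["safe"]:
            self._enter("grace")
        return "allow"

    def _enter(self, state):
        self.state, self.count = state, 0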
2) Session Opening:
--> ASM checks how many sessions have been opened from the IP address that is trying to access the web application.
--> It also checks how many requests to the web application arrive without an ASM cookie.
--> If ASM finds that the number of sessions opened from a specific client exceeds the value set in Session Opening, it sends a JavaScript challenge to detect whether the client is a bot or a human.
--> If the client executes the JavaScript and returns the ASM cookie in its response, ASM treats the client as human.
--> We can also configure rate limiting, which limits the number of sessions from the detected client.
--> If IP Intelligence is enabled, ASM checks the client IP address against the IP Intelligence database; if the IP address is found there, ASM drops the request.
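The Session Opening decision flow above can be summarized in a short, hypothetical sketch (the function, parameter names, and the in-memory stand-in for the IP Intelligence database are all illustrative, not ASM internals):

```python
def session_opening_check(client_ip, sessions_per_ip, threshold,
                          ip_reputation_db, responded_to_js):
    """Hypothetical decision flow for the Session Opening check.
    sessions_per_ip: dict of IP -> sessions opened without an ASM cookie.
    ip_reputation_db: set of known-bad IPs (stand-in for IP Intelligence).
    responded_to_js: callable(ip) -> True if the client executed the
    JavaScript challenge and returned the ASM cookie."""
    if client_ip in ip_reputation_db:
        return "drop"                    # known-bad source: drop the request
    if sessions_per_ip.get(client_ip, 0) <= threshold:
        return "allow"                   # under the session-opening limit
    # Over the limit: challenge the client with JavaScript.
    if responded_to_js(client_ip):
        return "allow"                   # cookie returned -> treated as human
    return "rate-limit"                  # no cookie -> bot; limit its sessions
```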
3) Session Transactions Anomaly:
--> ASM checks for sessions that consume significantly more resources than a normal session.
--> ASM creates a database table of up to 5000 records containing transaction information for each session.
--> The session transaction count is based on the TS cookie.
--> If there are no transactions from a client within 15 minutes, its entry is removed from the session transaction table.
--> If the transaction table becomes full, the oldest entries are automatically removed.
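The table behavior described above (5000-record cap, 15-minute idle expiry, oldest-entry eviction) can be sketched as follows; this is an illustrative model, not ASM's actual data structure:

```python
import time
from collections import OrderedDict

class SessionTransactionTable:
    """Sketch of the session transaction table: capped at max_entries
    (5000 on ASM), idle sessions expire after idle_timeout seconds
    (15 minutes on ASM), and the oldest entry is evicted when full."""

    def __init__(self, max_entries=5000, idle_timeout=15 * 60):
        self.max_entries = max_entries
        self.idle_timeout = idle_timeout
        self.table = OrderedDict()  # TS cookie -> (tx_count, last_seen)

    def record(self, ts_cookie, now=None):
        now = time.monotonic() if now is None else now
        # Expire sessions idle longer than the timeout.
        for cookie in [c for c, (_, seen) in self.table.items()
                       if now - seen > self.idle_timeout]:
            del self.table[cookie]
        count, _ = self.table.pop(ts_cookie, (0, now))
        self.table[ts_cookie] = (count + 1, now)   # move to newest position
        if len(self.table) > self.max_entries:
            self.table.popitem(last=False)          # evict the oldest entry
```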
4) Fingerprinting and Suspicious Clients:
--> Fingerprinting is used to collect attributes/information about the browser of the client at a given IP address.
--> Using the Suspicious Clients feature, we can detect browser extensions used to perform web scraping.
--> By default, ASM does not prevent web scraping performed by search engine bots such as Google and Yahoo.
--> You can add Search Engine bot Information by navigating Security > Options > Application Security > Advanced Configuration > Search Engine.
--> Most importantly, the F5 must be able to perform DNS lookups in order to verify that the traffic really originates from those search engines.
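A common way such verification works is a reverse-then-forward DNS check: resolve the IP to a hostname, confirm the hostname belongs to a known crawler domain, then resolve that hostname back and confirm it matches the original IP. The sketch below is illustrative of this general technique, not ASM's implementation; the resolver functions are injectable so the logic can be exercised without network access:

```python
import socket

def is_verified_search_engine(client_ip,
                              allowed_suffixes=(".googlebot.com", ".google.com"),
                              gethostbyaddr=socket.gethostbyaddr,
                              gethostbyname=socket.gethostbyname):
    """Reverse-then-forward DNS verification of a claimed search engine bot."""
    try:
        hostname = gethostbyaddr(client_ip)[0]        # reverse (PTR) lookup
    except OSError:
        return False
    if not hostname.endswith(allowed_suffixes):
        return False                                  # not a known crawler domain
    try:
        return gethostbyname(hostname) == client_ip   # forward lookup must match
    except OSError:
        return False
```

The forward lookup is what defeats spoofing: anyone can set a PTR record claiming to be `googlebot.com`, but only the real operator controls the forward zone that resolves back to the source IP.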
Note: You can also enable Persistent Device Identification to track clients that try to evade web scraping detection by resetting their session (deleting their cookies).
--> To configure Web Scraping on the BIG-IP ASM, navigate to Security > Application Security > Anomaly Detection > Web Scraping
--> You can check the Web Scraping logs by navigating to Security > Event Logs > Application > Web Scraping Attacks
Ref: F5.com,
Md.Kareemoddin