Scraping LinkedIn Public Profiles for Fun and Profit
Reconnaissance and information gathering are part of almost every penetration testing engagement. Often, the tester performs only network reconnaissance in an attempt to discover the company's network infrastructure (i.e. IP addresses, domain names, etc.), but there are other types of reconnaissance to conduct, and no, I'm not talking about dumpster diving. Thanks to social networks like LinkedIn, OSINT/WEBINT now yields more information. This information can then be used to help the tester test anything from social engineering to weak passwords.

In this blog post I will show you how to use Pythonect to easily generate potential passwords from LinkedIn public profiles. If you haven't heard about Pythonect yet, it is a new, experimental, general-purpose dataflow programming language based on the Python programming language. Pythonect is most suitable for creating applications that are themselves focused on the flow of the data. An application that generates passwords from the public LinkedIn profiles of a given company's employees has a coherent and clear dataflow:

1. Find all the employees' public LinkedIn profiles
2. Scrape all the employees' public LinkedIn profiles
3. Crunch all the data into potential passwords

Now that we have the general concept and high-level overview out of the way, let's dive into the details.

Finding all the employees' public LinkedIn profiles will be done via Google Custom Search Engine, a free service by Google that allows anyone to create their own search engine. The idea is to create a search engine that, when searching for a given company name, returns all the employees' public LinkedIn profiles. How? When creating a Google Custom Search Engine, it's possible to restrict the search results to a specific site (i.e. 'Sites to search'), and we're going to limit ours to linkedin.com. It's also possible to fine-tune the search results even further, e.g. uk.linkedin.com to find only employees from the United Kingdom.

Access to the newly created Google Custom Search Engine will be made using a free API key obtained from the Google API Console. Why go through the Google API? Because it allows automation (no CAPTCHAs), and it also means that the search-result pages will be returned as JSON (as opposed to HTML). The only catch with using the free API key is that it's limited to 100 queries per day, but it's possible to buy an API key that will not be limited.

Scraping the profiles is a matter of iterating over all the hCards in all the search-result pages, and extracting the employee name from each hCard. What is an hCard? hCard is a microformat for publishing the contact details of people, companies, organizations, and places. hCard is also supported by social networks such as Facebook, Google+, LinkedIn, etc. for exporting public profiles. Google, when indexing, parses hCards and, when relevant, uses them in search-result pages. In other words, when search-result pages include LinkedIn public profiles, they appear as hCards and can be easily parsed.

Let's see the implementation of the above:

    #!/usr/bin/python
    #
    # Copyright (C) 2012 Itzik Kotler
    #
    # scraper.py is free software: you can redistribute it and/or modify
    # it under the terms of the GNU General Public License as published by
    # the Free Software Foundation, either version 3 of the License, or
    # (at your option) any later version.
    #
    # scraper.py is distributed in the hope that it will be useful,
    # but WITHOUT ANY WARRANTY; without even the implied warranty of
    # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    # GNU General Public License for more details.
    #
    # You should have received a copy of the GNU General Public License
    # along with scraper.py.  If not, see <http://www.gnu.org/licenses/>.

    """Simple LinkedIn public profiles scraper that uses Google Custom Search"""

    # Python 2 code: urllib.urlopen is not available in Python 3.
    import urllib
    import simplejson

    BASE_URL = "https://www.googleapis.com/customsearch/v1?key=<YOUR GOOGLE API KEY>&cx=<YOUR GOOGLE SEARCH ENGINE CX>"


    def __get_all_hcards_from_query(query, index=0, hcards=None):
        # Start from a fresh dict on each top-level call; a mutable default
        # argument would be shared between calls.
        if hcards is None:
            hcards = {}

        url = query
        if index != 0:
            url = url + '&start=%d' % (index)

        json = simplejson.loads(urllib.urlopen(url).read())

        if 'error' in json:
            print("Stopping at %s due to Error" % (url))
            print(json)
        else:
            for item in json['items']:
                try:
                    # Map the full name ('fn') to the job title ('title')
                    hcards[item['pagemap']['hcard'][0]['fn']] = item['pagemap']['hcard'][0]['title']
                except KeyError as e:
                    # Result without a usable hCard; skip it
                    pass

            # Follow the pagination until there is no next page
            if 'nextPage' in json['queries']:
                return __get_all_hcards_from_query(query, json['queries']['nextPage'][0]['startIndex'], hcards)

        return hcards


    def get_all_employees_by_company_via_linkedin(company):
        queries = ['"at %s" inurl:"in"', '"at %s" inurl:"pub"']
        result = {}
        for query in queries:
            _query = query % company
            result.update(__get_all_hcards_from_query(BASE_URL + '&q=' + _query))
        # Only the names (the dict keys) are returned
        return list(result)

Replace <YOUR GOOGLE API KEY> and <YOUR GOOGLE SEARCH ENGINE CX> in the code above with your Google API Key and Google Search Engine CX respectively, save it to a file called scraper.py, and you're ready!

To kick-start, here is a simple program in Pythonect that uses the scraper module to search for, and print, the full names of all the employees of the company Pythonect:

    "Pythonect" -> scraper.get_all_employees_by_company_via_linkedin -> print

The output should be:

    Itzik Kotler

In my LinkedIn profile, I have listed Pythonect as a company that I work for, and since no one else works there, searching for all the employees of the company Pythonect brings up only my LinkedIn profile. For demonstration purposes I will keep using this example (i.e. Pythonect as the company, and Itzik Kotler as the employee), but go ahead and replace Pythonect with other, more popular company names and see the results.

Now that we have a working skeleton, let's take its output and start crunching it.
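To make the hCard extraction step in scraper.py easier to follow without an API key, here is a minimal offline sketch in Python 3. The response dictionary is hand-written for illustration and only mimics the fields that scraper.py reads (items, pagemap, hcard, fn, title, queries, nextPage). The helper names extract_hcards and next_start_index, as well as the job title in the sample, are assumptions made for this example and are not part of the original module.

```python
# Hand-written stand-in for one page of search results (illustrative only;
# it mimics just the fields that scraper.py reads).
SAMPLE_RESPONSE = {
    "queries": {"nextPage": [{"startIndex": 11}]},
    "items": [
        {"pagemap": {"hcard": [{"fn": "Itzik Kotler", "title": "Example Title"}]}},
        {"pagemap": {}},  # a result without an hCard
    ],
}


def extract_hcards(response):
    """Return a dict mapping each hCard full name ('fn') to its 'title'."""
    hcards = {}
    for item in response.get("items", []):
        try:
            card = item["pagemap"]["hcard"][0]
            hcards[card["fn"]] = card["title"]
        except (KeyError, IndexError):
            # No usable hCard in this result; skip it.
            continue
    return hcards


def next_start_index(response):
    """Return the start index of the next result page, or None if absent."""
    next_page = response.get("queries", {}).get("nextPage")
    return next_page[0]["startIndex"] if next_page else None


if __name__ == "__main__":
    print(extract_hcards(SAMPLE_RESPONSE))
    print(next_start_index(SAMPLE_RESPONSE))
```

Keeping the parsing separate from the HTTP request, as in this sketch, makes it possible to test the extraction logic against saved responses without spending any of the daily query quota.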
Keep in mind that every password generation formula is merely a guess. The examples below are only a sampling of what can be done. There are obviously many more possibilities, and you are encouraged to experiment. But first, let's normalize the output, so that it is consistent before operations are performed on it:

    "Pythonect" -> scraper.get_all_employees_by_company_via_linkedin -> string.lower(''.join(_.split()))

The normalization procedure is short and simple: convert the string to lowercase and remove any spaces. The output should now be:

    itzikkotler

As for data manipulation, out of the box (thanks to the Python Standard Library) we've got itertools and its combinatoric generators. Let's start by applying itertools.product:

    "Pythonect" -> scraper.get_all_employees_by_company_via_linkedin -> string.lower(''.join(_.split())) -> itertools.product(_, repeat=4) -> print

The code above will generate and print every 4-character combination of the letters i, t, z, k, o, l, e, r. However, it won't cover passwords with uppercase letters. So here's a simple and straightforward implementation of a cycle_uppercase function that cycles through the input letters and, for each position, yields a copy of the input with that letter in uppercase:

    def cycle_uppercase(i):
        # itertools.product yields tuples of characters; join them first
        s = ''.join(i)
        for idx in xrange(0, len(s)):
            # Uppercase only the character at position idx
            yield s[:idx] + s[idx].upper() + s[idx + 1:]

To use it, save it to a file called itertools2.py, and then simply add it to the Pythonect program after the itertools.product(_, repeat=4) block, as follows:

    "Pythonect" -> scraper.get_all_employees_by_company_via_linkedin -> string.lower(''.join(_.split())) -> itertools.product(_, repeat=4) -> itertools2.cycle_uppercase -> print

Now the program will also cover passwords that include a single uppercase letter.

Moving on with the data manipulation, sometimes the password might contain symbols that are not found within the scraped data. In this case, it is necessary to build a generator that will take the input and add symbols to it.
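Before adding symbols, the steps covered so far (normalization, itertools.product, and cycle_uppercase) can be approximated in plain Python 3 for readers who do not have Pythonect installed. This is only a sketch of the same dataflow, not Pythonect itself: the function names normalize and candidates are assumptions introduced for this example, and range replaces the Python 2 xrange.

```python
import itertools


def normalize(full_name):
    """Lowercase the name and remove all whitespace."""
    return "".join(full_name.split()).lower()


def cycle_uppercase(letters):
    """Yield copies of the input with exactly one character uppercased."""
    s = "".join(letters)
    for idx in range(len(s)):
        yield s[:idx] + s[idx].upper() + s[idx + 1:]


def candidates(full_name, length=4):
    """Chain the steps: normalize, take every product, cycle the uppercase."""
    for combo in itertools.product(normalize(full_name), repeat=length):
        yield from cycle_uppercase(combo)


if __name__ == "__main__":
    print(normalize("Itzik Kotler"))
    print(list(cycle_uppercase("abc")))
    # Only peek at the first few candidates; the full stream is large.
    print(list(itertools.islice(candidates("Itzik Kotler"), 4)))
```

Note that itzikkotler has 11 characters but only 8 distinct letters, so itertools.product over the raw string yields 11**4 = 14,641 tuples with many repeats. Feeding it the distinct letters instead, for example sorted(set(normalize(name))), would cut that to 8**4 = 4,096.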
Here is a short and simple generator implemented as a generator expression:

    (_ + postfix for postfix in ['123', ' ', ' '])

To use it, simply add it to the Pythonect program after the itertools2.cycle_uppercase block, as follows:

    "Pythonect" -> scraper.get_all_employees_by_company_via_linkedin -> string.lower(''.join(_.split())) -> itertools.product(_, repeat=4) -> itertools2.cycle_uppercase -> (_ + postfix for postfix in ['123', ' ', ' ']) -> print

The result is that the program now appends the strings '123', ' ', and ' ' to every generated password, which may or may not increase the chances of guessing the user's password, depending on the password.

To summarize, it's possible to take OSINT/WEBINT data on a given person or company and use it to generate potential passwords, and it's easy to do with Pythonect. There are, of course, many different ways to manipulate the data into passwords, and many programs and filters that can be used. In this respect, Pythonect, being a flow-oriented language, makes it easy to experiment with different modules and programs in a plug-and-play manner.
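As a closing illustration, the postfix step can also be sketched in plain Python 3, together with a quick estimate of how many candidates the whole pipeline produces. The function names add_postfixes and count_candidates are assumptions made for this example, and the default postfix values ('123', '!', '$') are illustrative choices only, not taken from the program above.

```python
def add_postfixes(candidate, postfixes=("123", "!", "$")):
    """Return a generator of the candidate with each postfix appended.

    The default postfix values are illustrative assumptions.
    """
    return (candidate + postfix for postfix in postfixes)


def count_candidates(n_chars, length, n_postfixes):
    """Estimate the pipeline output size.

    n_chars ** length tuples come out of itertools.product, each is turned
    into `length` single-uppercase variants, and each variant receives
    `n_postfixes` postfixes.
    """
    return (n_chars ** length) * length * n_postfixes


if __name__ == "__main__":
    print(list(add_postfixes("Itzi")))
    # 11 characters in "itzikkotler", 4-character products, 3 postfixes
    print(count_candidates(11, 4, 3))
```

A back-of-the-envelope count like this is a cheap way to judge whether a given combination of filters stays within a practical wordlist size before running the full pipeline.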