There is a page at https://m.imdb.com/list/ls000984564/. I'm trying to get the names of all the actors listed there (35 in total). I found a pattern that extracts the names, but I only ever get every other entry, and the starting entry varies between runs: sometimes I get names 1, 3, 5, ..., 35 and sometimes names 2, 4, 6, ..., 34, but never all of them at once. What am I doing wrong? The code is below.
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String str = "https://m.imdb.com/list/ls000984564/";
        HttpURLConnection urlConnection = null;
        try {
            URL url = new URL(str);
            urlConnection = (HttpURLConnection) url.openConnection();
            InputStream inputStream = urlConnection.getInputStream();
            InputStreamReader inputStreamReader = new InputStreamReader(inputStream);
            BufferedReader bufferedReader = new BufferedReader(inputStreamReader);
            StringBuilder htmlCode = new StringBuilder();
            while (bufferedReader.readLine() != null)
                htmlCode.append(bufferedReader.readLine());
            urlConnection.disconnect();
            ArrayList<String> actorsNamesList = new ArrayList<>();
            Pattern pattern = Pattern.compile("<h4>(.*?)</h4>");
            Matcher matcher = pattern.matcher(htmlCode.toString());
            while (matcher.find())
                actorsNamesList.add(matcher.group(1));
            for (String name : actorsNamesList)
                System.out.println(name);
            System.out.println("Size of a list: " + actorsNamesList.size());
        } catch (MalformedURLException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (urlConnection != null) {
                urlConnection.disconnect();
            }
        }
    }
}
One possible result:
Leonardo DiCaprio
Gary Oldman
Johnny Depp
Denzel Washington
Russell Crowe
Robert Downey Jr.
George Clooney
Josh Brolin
Paul Rudd
James Franco
Will Smith
Joseph Gordon-Levitt
Jack Nicholson
Stanley Tucci
Jamie Foxx
Tommy Lee Jones
Eric Bana
Jon Hamm
Size of a list: 18
Another result, which I frequently get in Android Studio:
2019-12-07 02:22:20.837 13840-13840/com.rumato.gratestfilmamericanlegends I/URL: Christian Bale
2019-12-07 02:22:20.837 13840-13840/com.rumato.gratestfilmamericanlegends I/URL: Mark Wahlberg
2019-12-07 02:22:20.838 13840-13840/com.rumato.gratestfilmamericanlegends I/URL: Matt Damon
2019-12-07 02:22:20.838 13840-13840/com.rumato.gratestfilmamericanlegends I/URL: Daniel Day-Lewis
2019-12-07 02:22:20.838 13840-13840/com.rumato.gratestfilmamericanlegends I/URL: Steve Carell
2019-12-07 02:22:20.838 13840-13840/com.rumato.gratestfilmamericanlegends I/URL: Edward Norton
2019-12-07 02:22:20.838 13840-13840/com.rumato.gratestfilmamericanlegends I/URL: Brad Pitt
2019-12-07 02:22:20.838 13840-13840/com.rumato.gratestfilmamericanlegends I/URL: Ryan Reynolds
2019-12-07 02:22:20.838 13840-13840/com.rumato.gratestfilmamericanlegends I/URL: Geoffrey Rush
2019-12-07 02:22:20.838 13840-13840/com.rumato.gratestfilmamericanlegends I/URL: Ken Watanabe
2019-12-07 02:22:20.838 13840-13840/com.rumato.gratestfilmamericanlegends I/URL: Aaron Eckhart
2019-12-07 02:22:20.838 13840-13840/com.rumato.gratestfilmamericanlegends I/URL: Clive Owen
2019-12-07 02:22:20.838 13840-13840/com.rumato.gratestfilmamericanlegends I/URL: Will Ferrell
2019-12-07 02:22:20.838 13840-13840/com.rumato.gratestfilmamericanlegends I/URL: Benicio Del Toro
2019-12-07 02:22:20.838 13840-13840/com.rumato.gratestfilmamericanlegends I/URL: James Gandolfini
2019-12-07 02:22:20.838 13840-13840/com.rumato.gratestfilmamericanlegends I/URL: Josh Hartnett
2019-12-07 02:22:20.838 13840-13840/com.rumato.gratestfilmamericanlegends I/URL: Greg Kinnear
2019-12-07 02:22:20.840 13840-13840/com.rumato.gratestfilmamericanlegends I/URL: Size: 17
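The alternating output is consistent with the read loop calling readLine() twice per iteration: the call in the while condition consumes one line and discards it, and the call in the loop body appends the next one, so only every second line of the HTML reaches the StringBuilder. Which half of the names survives would then depend on whether they fall on odd or even lines of the response, and that can shift if the served page differs by a line (an assumption; the exact HTML is not shown here). A minimal sketch of the usual fix follows: store the result of readLine() in a variable, test it, and append that same variable. The class name, the helper methods readAll and extractNames, and the sample HTML are hypothetical and used only so the example runs without a network connection.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReadAllLinesSketch {
    // Reads every line exactly once: readLine() is called a single time
    // per iteration, and the stored value is both tested and appended.
    static String readAll(BufferedReader reader) throws IOException {
        StringBuilder htmlCode = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            htmlCode.append(line);
        }
        return htmlCode.toString();
    }

    // Same pattern as in the question: captures the text inside <h4>...</h4>.
    static List<String> extractNames(String html) {
        List<String> names = new ArrayList<>();
        Matcher matcher = Pattern.compile("<h4>(.*?)</h4>").matcher(html);
        while (matcher.find()) {
            names.add(matcher.group(1));
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for the downloaded page, one name per line.
        String sample = "<h4>Name A</h4>\n<h4>Name B</h4>\n<h4>Name C</h4>\n";
        String html = readAll(new BufferedReader(new StringReader(sample)));
        System.out.println(extractNames(html)); // prints [Name A, Name B, Name C]
    }
}
```

In the original program the same change would apply to the loop that fills htmlCode; the rest of the code (the connection handling and the regex) can stay as it is.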