I am developing a Chrome extension which, among other things, receives a (long) list of URLs in an array (or even at the console). I need to extract the URLs that match the (arbitrary) patterns with the fewest entries in that list, and completely discard the patterns with the most entries (say, more than 5 items).
I can't use a fixed regex for that since, as I said, the patterns can be entirely different from one run of the extension to the next (i.e. they're arbitrary). I still need to derive those patterns (or at least sort / group the list according to them) so that I can eliminate the patterns with the most items in my list.
How do I do that in JavaScript? My instinct is to do one of the following:
- Build regex patterns from the entries (their number is also arbitrary, so at first sight one can't know how many patterns to create), count the items belonging to each pattern, and keep only the items that belong to patterns with no more than 5 items.
- Compare the items' strings with each other, the way a similar-name duplicate finder (e.g. dupeGuru) would, and keep only the groups of similar items that have at most 5 members (but that would probably be beyond the scope of the extension, and there is most likely an easier / faster way).
- Sort the items in the list so that the patterns with the most items come last, then take the first, say, 10 items in the list (i.e. those that belong to the patterns with the fewest entries) and process them further. This would be my preferred solution and probably one of the fastest, but the problem is that a simple alphabetical sort wouldn't work in every case, since the patterns with the most entries could sort anywhere alphabetically (plus, there could be several patterns with lots of items).
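To make the first idea concrete (count the items per pattern, then keep only the rare ones), here is a minimal sketch. It assumes that a "pattern" can be approximated by the host plus the first few characters of the path. The names `patternKey` and `keepRare` and the prefix length of 4 are hypothetical choices for illustration, not something I already have, and the grouping key would certainly need tuning against real data.

```javascript
// Hypothetical grouping key: host + first `prefixLen` characters of the path.
// This is an assumption about what a "pattern" is, not a known rule.
function patternKey(url, prefixLen = 4) {
  const { host, pathname } = new URL(url);
  // pathname starts with "/", so take one extra character.
  return host + pathname.slice(0, prefixLen + 1);
}

// Keep only the URLs whose group has at most `maxSize` members.
// The original order of the list is preserved.
function keepRare(urls, maxSize = 5, keyFn = patternKey) {
  // First pass: count how many URLs fall under each key.
  const counts = new Map();
  for (const url of urls) {
    const key = keyFn(url);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // Second pass: drop every URL that belongs to an oversized group.
  return urls.filter((url) => counts.get(keyFn(url)) <= maxSize);
}
```

Passing a different `keyFn` would swap in another notion of "pattern" (a longer prefix, a regex-derived shape, and so on) without touching the counting logic.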
Test case:
var list =
[
"http://ift.tt/2wFVFHm",
"http://ift.tt/2y35fFF",
"http://ift.tt/2wFyX2i",
"http://ift.tt/2y4KWIg",
"http://ift.tt/2wFyXzk",
"http://ift.tt/2y35gcH",
"http://ift.tt/2wFVGeo",
"http://ift.tt/2y4KYjm",
"http://ift.tt/2wFyY6m",
"http://ift.tt/2y4KYQo",
"http://ift.tt/2wFyYDo",
"http://ift.tt/2y4KZUs",
"http://ift.tt/2wFVHPu",
"http://ift.tt/2y35ikP",
"http://ift.tt/2wFyZaq",
"http://ift.tt/2y35iRR",
"http://ift.tt/2wFVImw",
"http://ift.tt/2y35joT",
"http://ift.tt/2wFVITy",
"http://ift.tt/2y35kJt",
"http://ift.tt/2wFVJXC",
"http://ift.tt/2y4L22A",
"http://ift.tt/2wFVKuE",
"http://ift.tt/2y4L3na",
"http://ift.tt/2wFVLyI",
"http://ift.tt/2y4L3Uc",
"http://ift.tt/2wFVM5K",
"http://ift.tt/2y4L62k",
"http://ift.tt/2wFz0Lw",
"http://ift.tt/2y35oJd",
"http://ift.tt/2wFVMTi",
"http://ift.tt/2y4jREW",
"http://ift.tt/2wFz1iy",
"http://ift.tt/2y41y2J",
"http://ift.tt/2wFVNXm",
"http://ift.tt/2y4L7Dq",
"http://ift.tt/2wFz1PA",
"http://ift.tt/2y4L8as",
"http://ift.tt/2wFz2mC",
"http://ift.tt/2y3liTZ",
"http://ift.tt/2wFz2TE",
"http://ift.tt/2y54w76",
"http://ift.tt/2wGgXVa",
"http://ift.tt/2y4LaiA",
"http://ift.tt/2wFz3Hc",
"http://ift.tt/2y4LaPC",
"http://ift.tt/2wGfOgt",
"http://ift.tt/2y4LbDa",
"http://ift.tt/2wFz4uK",
"http://ift.tt/2y4Lcac",
"http://ift.tt/2wGDMbo"
]
In this case, I would like to keep only:
var result =
[
"http://ift.tt/2wFVFHm",
"http://ift.tt/2y35fFF",
"http://ift.tt/2wFyX2i",
"http://ift.tt/2y4jREW",
"http://ift.tt/2wFz1iy"
]
since these items belong to the patterns with 5 items at most.
A simple code sample or an explained method would be great. It would also greatly help (as in the difference between an upvoted answer and an accepted one, if both methods are present) if that method / code could do this 'on the fly' (i.e. at runtime), since I get the list items asynchronously, one after another, not all at once.
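For the 'on the fly' requirement, something along these lines might work: a small tracker that updates its group counts as each URL arrives. This is only a sketch under assumptions. `RareUrlTracker` is a hypothetical name, and the caller has to supply the grouping function `keyFn`, which is exactly the part I don't know how to write. Note that a group that looks rare now can still grow past the limit later, so the output of `rare()` is only final once the last item has arrived.

```javascript
// Incremental version: feed URLs one at a time with add(),
// ask for the currently rare ones with rare().
class RareUrlTracker {
  constructor(keyFn, maxSize = 5) {
    this.keyFn = keyFn;       // caller-supplied grouping function (assumption)
    this.maxSize = maxSize;   // groups larger than this are discarded
    this.groups = new Map();  // key -> { count, members }
  }

  add(url) {
    const key = this.keyFn(url);
    let group = this.groups.get(key);
    if (!group) {
      group = { count: 0, members: [] };
      this.groups.set(key, group);
    }
    group.count += 1;
    if (group.count > this.maxSize) {
      // Group is too common: forget its members, keep only the count.
      group.members = null;
    } else {
      group.members.push(url);
    }
  }

  rare() {
    const out = [];
    for (const group of this.groups.values()) {
      if (group.members) out.push(...group.members);
    }
    return out;
  }
}
```

Each asynchronous callback would then call `tracker.add(url)`, and `tracker.rare()` could be read whenever a snapshot is needed. Results come out grouped by key (in order of first appearance) rather than in arrival order.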