Meta-Analysis of Posts

Reading URLs from test_urls.txt…
Fetching https://www.kaggie.com/facts-about-monkeys/…
Fetching https://www.kaggie.com/frontoremporal-dementia/…
Fetching https://www.kaggie.com/the-basal-lamina/…
Fetching https://www.kaggie.com/beyond-the-bond-unveiling-molecular-architecture-with-the-nuclear-overhauser-effect/…
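The short labels used throughout the rest of this output (for example www_kaggie_com_facts) look like the first 20 characters of each URL with the scheme removed and punctuation replaced by underscores. The sketch below reproduces that naming under this assumption; the function name `url_to_label` and the `max_len` parameter are hypothetical and are not taken from the original script.

```python
import re

def url_to_label(url, max_len=20):
    """Turn a URL into a short, filename-friendly label (assumed scheme)."""
    # Drop the scheme ("https://"), if present.
    without_scheme = re.sub(r"^[a-z]+://", "", url)
    # Replace every character that is not a letter or digit with "_".
    cleaned = re.sub(r"[^0-9A-Za-z]", "_", without_scheme)
    # Keep only the first max_len characters.
    return cleaned[:max_len]

urls = [
    "https://www.kaggie.com/facts-about-monkeys/",
    "https://www.kaggie.com/the-basal-lamina/",
]
labels = [url_to_label(u) for u in urls]
# labels == ["www_kaggie_com_facts", "www_kaggie_com_the_b"]
```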
— Text Analysis Example —

Analyzing frequency of ‘spin’…
www_kaggie_com_facts: 0
www_kaggie_com_front: 0
www_kaggie_com_the_b: 0
www_kaggie_com_beyon: 335
Saved ‘spin_comparison.png’
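
The per-document counts above can be produced by a simple token count. This is a minimal sketch, assuming case-insensitive matching on whole word tokens; whether the original script counts tokens or substrings is not shown in the output, and `word_frequency` is a hypothetical name.

```python
import re
from collections import Counter

def word_frequency(text, word):
    """Count whole-word, case-insensitive occurrences of `word` in `text`."""
    tokens = re.findall(r"\w+", text.lower())
    return Counter(tokens)[word.lower()]

# Toy documents standing in for the fetched posts.
docs = {
    "doc_a": "Spin-spin coupling affects each spin.",
    "doc_b": "Monkeys do not spin.",
}
counts = {label: word_frequency(text, "spin") for label, text in docs.items()}
# counts == {"doc_a": 3, "doc_b": 1}
```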

— TF-IDF Analysis —

Top 20 TF-IDF words for www_kaggie_com_facts:
[(‘monkeys’, 0.0449000708758831), (‘their’, 0.01359560340642929), (‘monkey’, 0.012195081450045109), (‘baboons’, 0.008314828388392925), (‘macaques’, 0.007760506588965654), (‘like’, 0.006942435633391142), (‘world’, 0.006368696689605713), (‘some’, 0.006014880258589983), (‘species’, 0.005681438371539116), (‘primates’, 0.0055432189255952835), (‘them’, 0.005244405008852482), (‘social’, 0.005206826608628035), (‘old’, 0.004988896660506725), (‘evolutionary’, 0.004988896660506725), (‘capuchins’, 0.004988896660506725), (‘spp’, 0.004988896660506725), (‘cultural’, 0.004434574861079454), (‘across’, 0.004370337352156639), (‘these’, 0.004049754235893488), (‘tails’, 0.003933303523808718)]
Plot saved to text1_www_kaggie_com_facts_freq.png

Top 20 TF-IDF words for www_kaggie_com_front:
[(‘ftd’, 0.03387382626533508), (‘dementia’, 0.01935647241771221), (‘frontotemporal’, 0.012904314324259758), (‘behavioral’, 0.01129127573221922), (‘variant’, 0.01129127573221922), (‘clinical’, 0.009678236208856106), (‘diagnosis’, 0.009678236208856106), (‘common’, 0.00926623959094286), (‘speech’, 0.008902170695364475), (‘disease’, 0.00823665689677), (‘progressive’, 0.008065196685492992), (‘symptoms’, 0.008065196685492992), (‘ppa’, 0.008065196685492992), (‘proteinopathies’, 0.008065196685492992), (‘often’, 0.007575757801532745), (‘disorders’, 0.006452157162129879), (‘lobes’, 0.006452157162129879), (‘brain’, 0.006452157162129879), (‘personality’, 0.006452157162129879), (‘semantic’, 0.006452157162129879)]
Plot saved to text2_www_kaggie_com_front_freq.png

Top 20 TF-IDF words for www_kaggie_com_the_b:
[(‘matrix’, 0.026282601058483124), (‘leeds’, 0.01935647241771221), (‘extracellular’, 0.018281111493706703), (‘sciences’, 0.018281111493706703), (‘basal’, 0.01780434139072895), (‘biological’, 0.01780434139072895), (‘faculty’, 0.017205754294991493), (‘cell’, 0.016108691692352295), (‘university’, 0.013565214350819588), (‘lamina’, 0.01271738763898611), (‘https’, 0.011668597348034382), (‘acrobiosystems’, 0.010753595270216465), (‘2023’, 0.010753595270216465), (‘ecm’, 0.008602877147495747), (‘arends’, 0.008602877147495747), (‘lieleg’, 0.008602877147495747), (‘2016’, 0.008602877147495747), (‘com’, 0.007856341078877449), (‘zhang’, 0.0075275166891515255), (‘collagen’, 0.0075275166891515255)]
Plot saved to text3_www_kaggie_com_the_b_freq.png

Top 20 TF-IDF words for www_kaggie_com_beyon:
[(‘noe’, 0.031968988478183746), (‘spin’, 0.012666384689509869), (‘relaxation’, 0.01051491778343916), (‘noes’, 0.009211916476488113), (‘data’, 0.008302845992147923), (‘distance’, 0.006848332937806845), (‘protons’, 0.006333192344754934), (‘cross’, 0.00624228548258543), (‘nmr’, 0.005848355125635862), (‘noesy’, 0.005757447797805071), (‘molecular’, 0.005647747777402401), (‘structure’, 0.005375413224101067), (‘proton’, 0.005302912555634975), (‘experiments’, 0.004545353818684816), (‘dynamics’, 0.00444367527961731), (‘molecules’, 0.00418087700381875), (‘diffusion’, 0.004085313994437456), (‘these’, 0.004063948057591915), (‘between’, 0.004048134665936232), (‘structural’, 0.003984369803220034)]
Plot saved to text4_www_kaggie_com_beyon_freq.png
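
For reference, TF-IDF scores of this kind can be computed with the standard library alone. The sketch below assumes length-normalized term frequency and a smoothed inverse document frequency, log((1 + N) / (1 + df)) + 1; the exact weighting used to produce the numbers above is not stated, so treat the formula and the function names as assumptions.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"\w+", text.lower())

def tfidf(docs):
    """Return {label: {word: score}} for a {label: text} mapping."""
    tokens = {label: tokenize(text) for label, text in docs.items()}
    n_docs = len(tokens)
    # Document frequency: in how many documents each word appears.
    df = Counter()
    for toks in tokens.values():
        df.update(set(toks))
    scores = {}
    for label, toks in tokens.items():
        counts = Counter(toks)
        total = len(toks)
        scores[label] = {
            w: (c / total) * (math.log((1 + n_docs) / (1 + df[w])) + 1)
            for w, c in counts.items()
        }
    return scores

def top_terms(word_scores, k=20):
    """Highest-scoring (word, score) pairs for one document."""
    return sorted(word_scores.items(), key=lambda kv: kv[1], reverse=True)[:k]

docs = {"a": "monkeys eat fruit monkeys", "b": "spin relaxation spin fruit"}
scores = tfidf(docs)
best = top_terms(scores["a"], k=1)  # the top term for "a" is "monkeys"
```

With smoothing, a word that occurs in every document still receives a nonzero score, which would be consistent with common words such as "these" appearing in the lists above.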

— N-Gram Analysis (Bigrams) —
2-Gram analysis complete.

Top 3 Bigrams for www_kaggie_com_facts:
(‘of’, ‘the’): 9
(‘Old’, ‘World’): 8
(‘World’, ‘monkeys,’): 8
2-Gram analysis complete.

Top 3 Bigrams for www_kaggie_com_front:
(‘of’, ‘FTD’): 8
(‘FTD’, ‘is’): 6
(‘most’, ‘common’): 6
2-Gram analysis complete.

Top 3 Bigrams for www_kaggie_com_the_b:
(‘Biological’, ‘Sciences,’): 16
(‘Sciences,’, ‘University’): 16
(‘University’, ‘of’): 16
2-Gram analysis complete.

Top 3 Bigrams for www_kaggie_com_beyon:
(‘of’, ‘the’): 331
(‘to’, ‘the’): 136
(‘can’, ‘be’): 134
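
The bigrams above keep their original case and trailing punctuation (‘monkeys,’ and ‘Sciences,’), which is what plain whitespace splitting would produce. A minimal sketch under that assumption, with `top_bigrams` as a hypothetical name:

```python
from collections import Counter

def top_bigrams(text, k=3):
    """Most frequent adjacent word pairs, splitting on whitespace only."""
    tokens = text.split()
    # zip pairs each token with its successor.
    return Counter(zip(tokens, tokens[1:])).most_common(k)

sample = "Old World monkeys, and Old World apes"
result = top_bigrams(sample, k=1)
# result == [(("Old", "World"), 2)]
```

Because stop words are not filtered, pairs such as (‘of’, ‘the’) dominate long documents.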

— PMI Analysis (Collocations) —

Top 20 Collocations (PMI) for www_kaggie_com_facts:
(‘kaggie’, ‘com’): 6.3564
(‘prehensile’, ‘tails’): 5.7968
(‘facts’, ‘about’): 5.4401
(‘old’, ‘world’): 5.1400
(‘new’, ‘world’): 5.1243
(‘such’, ‘as’): 4.9293
(‘species’, ‘like’): 4.0146
(‘behavior’, ‘that’): 3.9585
(‘due’, ‘to’): 3.8044
(‘spider’, ‘monkeys’): 3.7537
(‘one’, ‘of’): 3.5212
(‘world’, ‘monkeys’): 3.5024
(‘about’, ‘monkeys’): 3.2429
(‘the’, ‘most’): 3.2318
(‘a’, ‘behavior’): 3.1726
(‘them’, ‘to’): 2.9289
(‘monkeys’, ‘are’): 2.7729
(‘monkeys’, ‘like’): 2.7729
(‘in’, ‘some’): 2.5177
(‘as’, ‘a’): 2.1693

Top 20 Collocations (PMI) for www_kaggie_com_front:
(‘most’, ‘common’): 4.8828
(‘characterized’, ‘by’): 4.7774
(‘alzheimer’, ‘s’): 4.6821
(‘frontotemporal’, ‘dementia’): 4.4616
(‘the’, ‘most’): 3.4012
(‘ftd’, ‘is’): 2.9369
(‘of’, ‘ftd’): 2.6810
(‘in’, ‘the’): 2.1972

Top 20 Collocations (PMI) for www_kaggie_com_the_b:
(‘arends’, ‘lieleg’): 5.2725
(‘lieleg’, ‘2016’): 5.2725
(‘acrobiosystems’, ‘2023’): 5.1829
(‘et’, ‘al’): 5.1829
(‘al’, ‘2021’): 5.0006
(‘zhang’, ‘et’): 4.8464
(‘sciences’, ‘university’): 4.6523
(‘https’, ‘www’): 4.6523
(‘retrieved’, ‘from’): 4.5951
(‘biological’, ‘sciences’): 4.3803
(‘2016’, ‘faculty’): 4.2429
(‘basal’, ‘lamina’): 4.2178
(‘extracellular’, ‘matrix’): 4.0515
(‘n’, ‘d’): 3.8150
(‘leeds’, ‘n’): 3.7842
(‘d’, ‘b’): 3.7478
(‘nordin’, ‘n’): 3.7478
(‘from’, ‘https’): 3.5537
(‘faculty’, ‘of’): 3.4782
(‘university’, ‘of’): 3.4782

Top 20 Collocations (PMI) for www_kaggie_com_beyon:
(‘vice’, ‘versa’): 9.4452
(‘van’, ‘der’): 9.4452
(‘der’, ‘waals’): 9.4452
(‘machine’, ‘learning’): 9.2629
(‘mass’, ‘spectrometry’): 9.2629
(‘mermaid’, ‘diagram’): 9.1088
(‘xplor’, ‘nih’): 8.9752
(‘natural’, ‘products’): 8.7929
(‘root’, ‘mean’): 8.6568
(‘extreme’, ‘narrowing’): 8.5698
(‘electron’, ‘microscopy’): 8.4897
(‘future’, ‘directions’): 8.3466
(‘source’, ‘material’): 8.2697
(‘square’, ‘deviation’): 8.2697
(‘intrinsically’, ‘disordered’): 8.2384
(‘30’, ‘80’): 8.2333
(‘i_z’, ‘i_0’): 8.1643
(‘fourier’, ‘transform’): 8.1643
(‘m_’, ‘z’): 8.1643
(‘ray’, ‘crystallography’): 8.1643
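
Pointwise mutual information compares how often two words occur together with how often they would co-occur by chance: PMI(x, y) = log(p(x, y) / (p(x) p(y))). The sketch below assumes lowercase word tokens, a natural logarithm, and a minimum bigram count to suppress one-off pairs; the log base, the threshold, and the function name are assumptions rather than details taken from the original script.

```python
import math
import re
from collections import Counter

def pmi_collocations(text, min_count=2, k=20):
    """Rank adjacent word pairs by pointwise mutual information."""
    tokens = re.findall(r"\w+", text.lower())
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scored = []
    for (x, y), c in bigrams.items():
        if c < min_count:
            continue  # rare pairs give unstable, inflated PMI values
        p_xy = c / n_bi
        p_x = unigrams[x] / n_uni
        p_y = unigrams[y] / n_uni
        scored.append(((x, y), math.log(p_xy / (p_x * p_y))))
    return sorted(scored, key=lambda kv: kv[1], reverse=True)[:k]

sample = ("the cat and the dog vice versa "
          "the dog and the cat vice versa the end")
ranked = pmi_collocations(sample)
# The top-ranked pair is ("vice", "versa"): both words occur only together.
```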

— Sentiment Analysis —
Sentiment Score for www_kaggie_com_facts: 10
Sentiment Score for www_kaggie_com_front: 0
Sentiment Score for www_kaggie_com_the_b: 15
Sentiment Score for www_kaggie_com_beyon: 289
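
Integer scores like these are typical of a lexicon-based count: positive words minus negative words. The sketch below illustrates that idea with a tiny made-up lexicon; the word lists and the function name are hypothetical, and the scoring method actually used for the numbers above is not stated.

```python
import re

# Hypothetical mini-lexicons for illustration only.
POSITIVE = {"good", "vital", "robust", "powerful", "accurate"}
NEGATIVE = {"bad", "loss", "decline", "disorder", "error"}

def sentiment_score(text):
    """Positive-word count minus negative-word count (not length-normalized)."""
    tokens = re.findall(r"\w+", text.lower())
    return sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens)

score = sentiment_score("A robust and accurate method with one error")
# score == 1  (two positive words, one negative word)
```

A raw count like this grows with document length, so scores for long and short posts are only comparable after dividing by the token count.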

— Summarization (Top 3 Sentences) —

Summary for www_kaggie_com_facts:
Facts-about: Monkeys – kaggie.com kaggie.com About Me Topics FAQ Facts-about: Monkeys Apr 19, 2025 — by Josh in AI-Generated , Facts-about Monkeys, a diverse and captivating group of primates, encompass over 260 species within the order Primates, split into New World monkeys (Platyrrhini) and Old World monkeys (Cercopithecidae). Monkeys are broadly categorized into two infraorders based on their geographic distribution and anatomical traits: New World monkeys, found in Central and South America, and Old World monkeys, native to Africa, Asia, and parts of Europe. As frugivores, many monkeys, like spider monkeys and howler monkeys, are vital seed dispersers.

Summary for www_kaggie_com_front:
D. Frontotemporal Dementia. It generally manifests in three ways: Behavioral Variant FTD (bvFTD): The most common form.

Summary for www_kaggie_com_the_b:
(n.d.-a). (n.d.). (n.d.).

Summary for www_kaggie_com_beyon:

  1. 1. 1.
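
The last two summaries consist of fragments such as "(n.d.)." and "1.", which suggests that citation stubs and list markers were scored as sentences. A frequency-based extractive summarizer can avoid this by ignoring very short sentences. The sketch below is one way to do that; the `min_words` filter, the average-frequency scoring, and the function name are assumptions and not the original script's method.

```python
import re
from collections import Counter

def summarize(text, k=3, min_words=5):
    """Pick the k sentences with the highest average word frequency."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freqs = Counter(re.findall(r"\w+", text.lower()))

    def score(sentence):
        words = re.findall(r"\w+", sentence.lower())
        if len(words) < min_words:
            return 0.0  # skip fragments like "(n.d.)." or "1."
        return sum(freqs[w] for w in words) / len(words)

    best = sorted(range(len(sentences)),
                  key=lambda i: score(sentences[i]), reverse=True)[:k]
    # Emit the chosen sentences in their original order.
    return " ".join(sentences[i] for i in sorted(best))

sample = ("(n.d.). Monkeys are social primates that live in groups. 1. "
          "Monkeys eat fruit and disperse seeds widely. The end.")
summary = summarize(sample, k=2)
```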

— LDA Topic Modeling (5 Topics) —
Topic 1: often, specific, protein, most, other, regions, solution, intensities, changes, non
Topic 2: matrix, biological, basal, cell, https, leeds, com, sciences, university, extracellular
Topic 3: noe, spin, noes, molecular, relaxation, nmr, dynamics, molecules, diffusion, between
Topic 4: noe, data, structure, these, proton, distance, experiments, time, information, more
Topic 5: monkeys, their, like, monkey, ftd, world, social, some, behavior, these

