[Blogged] Implications of AI powered academic search open.substack.com/pub/aarontay...
15.10.2025 17:00
@aarontay.bsky.social
I'm a librarian + blogger from Singapore Management University. Social media, bibliometrics, analytics, academic discovery tech.
Hmm
15.10.2025 12:49
We get Copilot and Gemini 2.5 Pro. Libraries in university
15.10.2025 12:47
Yes.
14.10.2025 14:44
"Unlike closed ecosystems that require researchers to adopt proprietary tools, Wiley AI Gateway prioritizes intentional interoperability, seamlessly integrating scholarly content and data subscriptions with today's leading AI platforms." newsroom.wiley.com/press-releas...
14.10.2025 13:34
Because of this, you can be assured the views in my blog posts are truly mine. That said, I generally try not to be too negative, so if a product doesn't meet my satisfaction, I generally don't mention it.
14.10.2025 13:33
Just to clarify (since it comes up now and then): I don't accept compensation or sponsorships for product mentions on my blog. If I write about something, it's because I genuinely find it interesting or valuable for my readers. I value my independence too much to do otherwise.
14.10.2025 13:27
My wild guess is that, besides Boolean query expansion, it also does NER (named entity recognition), so if you type in natural language "find me BOOK with TITLE XYZ", it will automatically turn on title search and filter to books?
14.10.2025 12:11
Found this on a mailing list. Still hard to visualize what's going on.
14.10.2025 11:29
To be more exact, everyone will ingest the free sets like OpenAlex and Semantic Scholar, which are broad but lack depth, since they lack full text and, increasingly, abstracts. Competition will be around the rest...
14.10.2025 11:26
Oh. Yeah, then it might not be comparable. But even then, how did you get the Google Scholar figure?
14.10.2025 04:14
I suppose you could build complicated models to predict the % chance a peer reviewer might accept... then the model probably ends up recommending junior peeps?
14.10.2025 04:12
I am just wildly speculating. I don't understand MCP that well, and even less the dynamics between discovery service providers and content providers. I assume most of the free open stuff, e.g. Crossref/OpenAlex, will be locally indexed and won't be used via MCP in most academic discovery tools.
13.10.2025 15:58
For others who don't have such big indexes, I think chances are slim that content providers will allow them to ingest content into their systems, as this risks losing control (a much greater threat in the 2020s vs the 2010s). MCP might be a way around it (3)
13.10.2025 15:48
As of 2025, academic "AI discovery" is essentially RAG over localised central indexes. The question is whether the existing central indexes, e.g. Ex Libris CDI and EDS, will secure agreement from content providers to allow RAG over their content. Already Elsevier + others have opted out of Summon/Primo RA (2)
13.10.2025 15:46
Looking at MCP (Model Context Protocol) again. It occurs to me this is kind of a repeat of the 2010s academic central index vs federated real-time search debate, with the former camp, led by Summon, winning out. This time, though, I think we are unlikely to see a repeat (1)
13.10.2025 15:39
blog.prophy.ai/how-prophy-m... interesting
13.10.2025 15:02
People always ask me "What does SIFT for AI look like?" Meaning, what is the minimal set of habits you need to teach students to use it effectively for exploration of claims online? It's taken a couple of years to get here, but this is a start at a response mikecaulfield.substack.com/p/get-it-in-...
13.10.2025 02:27
Any more details? E.g. methods of estimation etc.
13.10.2025 12:02
Our periodic review of the coverage of the major bibliographic databases (October 2025)
GS no longer the largest due to the huge increase of OpenAlex. New data for Xueshu
Note that this analysis predates the latest OpenAlex "Walden" rebuild, which is still in beta. A quick check shows quite different results.
11.10.2025 10:19
As I learn more about the nuts and bolts of IR, e.g. HNSW and IVF/PQ, it's interesting, but for most end users it isn't useful, except maybe that it helps you understand why it's somewhat tricky to implement prefiltering + dense embeddings, particularly if the system wasn't set up for it initially. (5)
11.10.2025 09:37
It also makes a subtle distinction between a sparse vector and a sparse "representation". A sparse vector is, as you'd expect, one where most values are zero, and it is usually high-dimensional. The sparse representation, according to the book, refers to the way you store the vector, e.g. inverted index/COO/CSR formats. (4)
11.10.2025 09:34
Also a very nice way of decomposing user intent, such that the system needs (a) content understanding, (b) domain understanding and (c) user understanding (3)
11.10.2025 09:26
For example, I was always somewhat confused when it comes to search vs recommendations, but the book frames it as a spectrum, which is a very nice way to look at it (2)
11.10.2025 09:22
Finished the first 3 chapters on lexical search and the last 3 on LLM embeddings + RAG. Mostly covering things I knew, but I like some of the overall conceptual framework (1)
11.10.2025 09:20
Next piece. Things I still don't quite fully grasp about the topic.
11.10.2025 09:16
Really curious about the new natural language search in Primo NDE (not Primo Research Assistant). Hopefully they account for the fact that a large proportion of queries in Primo are known-item searches, not subject searches.
10.10.2025 15:18
It's ironic to see 2025 publications talking about academic AI search engines saying things like Elicit uses GPT-3 and Undermind.ai uses arXiv. (Might want to check if there are more updated sources.)
08.10.2025 14:49
Sorry. All virtual seats for Mike's session are now taken. But we still have seats for other events in this series. eventregistration.smu.edu.sg/event/TTT202...
07.10.2025 09:13