Show HN: Map of YC Startups

yc-map.vercel.app

104 points by yoouareperfect 10 months ago

Hey everybody! Hope you had a merry Christmas.

Today I had a bit of fun with Claude.

I started by scraping YC's startup list, ran the entries through OpenAI's embedding service, UMAP'd the embeddings down to just two coordinates, and then forced Claude to write React that would compile to visualize the result.
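
For anyone curious, a rough sketch of that kind of pipeline in Python. This is not the site's actual code: the model name, the shape of the `companies` records and all the helper names are assumptions made for illustration, and the embedding and UMAP steps need the `openai`, `umap-learn` and `numpy` packages (plus an API key) to run.

```python
def embed(texts, model="text-embedding-3-small"):
    """Embed each text via OpenAI's embeddings endpoint (model name is an assumption)."""
    # Imported lazily so the rest of the sketch loads without the package installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    # A list of several thousand companies would need to be sent in batches.
    response = client.embeddings.create(model=model, input=texts)
    return [item.embedding for item in response.data]


def reduce_to_2d(vectors, seed=42):
    """Project high-dimensional embeddings down to two coordinates with UMAP."""
    import numpy as np
    import umap  # provided by the umap-learn package

    reducer = umap.UMAP(n_components=2, random_state=seed)
    return reducer.fit_transform(np.asarray(vectors)).tolist()


def to_points(companies, coords):
    """Pair each company with its 2D position, ready for a scatter plot."""
    return [
        {"name": company["name"], "x": float(x), "y": float(y)}
        for company, (x, y) in zip(companies, coords)
    ]


def build_map(companies):
    """Scraped records in, plottable points out."""
    texts = [f'{c["name"]}: {c["description"]}' for c in companies]
    return to_points(companies, reduce_to_2d(embed(texts)))
```

The output of `build_map` is just a list of named points, so any scatter plot library could draw it.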

I had fun and I think it's interesting, so take a look!

Also note that you won't be able to zoom on mobile (I found out about this Plotly limitation way too late). If there's interest I can fix this issue by changing plotting libs tomorrow :)

Merry Christmas

Liftyee 10 months ago

Cool project, but missed opportunity to name the arbitrary dimensions Y and C...

  • lovestory 10 months ago

    My dumb ass was trying to figure out what each dimension meant

    • tptacek 10 months ago

      That doesn't make you dumb; there is no intuitive meaning for the axes chosen; you can think of them, roughly, as statistically chosen to maximize clustering.

      • bravura 10 months ago

        Statistically chosen to maximize *some* particular loss measure, which in this case might be the t-SNE or UMAP criterion, and is computed only globally and not for different filters.

        • tptacek 10 months ago

          Right (I mean, I'm saying "right" but really I should just say "I'm taking your word for it"), but even more fundamentally this is dimensionality reduction from an OpenAI embedding vector, which seems almost like the asymptotic limit of inscrutability.

  • Bilal_io 10 months ago

    OP made the change

  • yoouareperfect 10 months ago

    haha awesome, shipped!

    • ProofHouse 10 months ago

      I figure why not plot them with an X and Y (Y,C) of some sort

paxys 10 months ago

There's no need to include an X & Y axis, labels and gridlines if they all have no meaning. A simple cluster diagram is enough.
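
To illustrate why the axes mean nothing on their own: as far as I understand, a layout like this is only judged by how close points sit to each other, so rotating the whole map changes every coordinate while leaving every cluster exactly as it was. A minimal Python sketch with made-up points (not the site's data):

```python
import math
from itertools import combinations


def rotate(points, degrees):
    """Rotate 2D points about the origin by the given angle."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


def pairwise_distances(points):
    """Distances between every pair of points, in a fixed order."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


# Made-up 2D layout standing in for a reduced embedding: two loose clusters.
layout = [(0.0, 0.0), (1.0, 0.2), (0.9, 1.1), (5.0, 4.8), (5.3, 5.1)]
rotated = rotate(layout, 37)

before = pairwise_distances(layout)
after = pairwise_distances(rotated)

# Every coordinate moved, but the distances (and so the clusters) did not.
same_geometry = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(before, after))
print("coordinates changed:", rotated[3] != layout[3])
print("distances preserved:", same_geometry)
```

Any rotation angle gives an equally valid picture, which is why tick marks and gridlines add nothing here.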

  • ascorbic 10 months ago

    I agree it would be less confusing if they weren't there. I'm sure I'm not alone in spending some time trying to work out what the axes were.

rl_for_energy 10 months ago

It’d be nice to just see the name of the company on click instead of going to the website (I’m on mobile). Trying to find our company

HeyTomesei 10 months ago

Looks nice, but I'm lost. What do the colors represent? What do the axes #s represent?

  • kurayashi 10 months ago

    The colours represent the categories in the filters. Sadly they don’t show which category is which colour.

default_ 10 months ago

Improvement Proposition: Could you plot the investment size on the C-axis and the number of people working for the company on the Y-axis? The chart should be improved; otherwise, it lacks meaning.

rrr_oh_man 10 months ago

Cool concept! What are the X and Y axes?

Oh, and your website has an unchanged Wordpress favicon...

  • tptacek 10 months ago

    They're semi-arbitrary, dimensionally reduced from OpenAI embedding vectors.

crush_robo_1536 10 months ago

Love this! It'd be interesting if someone builds this but adds more dimensions (similar to Company status) that you can query or group by. For example, if I look at the S21 and W21 batches, then it'd be nice to know things like:

1. How many of these companies made it to series A, series B, etc

2. How many of these companies have > x employees (where x can be 5, 10, 20, etc)

3. How many of these companies had a founder that moved on to something else

This does require a lot more intelligent data scraping or manual data collection though.

  • iceman_w 10 months ago

    I've been scraping YC data week over week to track things like changes in founder, pivots in the idea, company shutting down, etc. You can check it out here https://pivots.fyi/

detente18 10 months ago

I wish you could filter by startup name. Would be curious to know where we (litellm) ended up

zild3d 10 months ago

fun, though I also got stuck on what the Y and C axes represent initially. IMO just hide the axes altogether, since the goal is just some visual clustering/similarity

  • skeeter2020 10 months ago

    Maybe I'm slow, but clustering on what dimension? The lack of axes and labeling makes it pretty confusing to me, but I'm a dinosaur.

    Visuals that are not self-explanatory make me feel dumb.

    • gavmor 10 months ago

      We don't know what to label those features/dimensions, because they're a reduction from higher dimensions that we also didn't bother to interrogate.

      It's possible to figure them out. I wish OP would.

      • yoouareperfect 10 months ago

        OP here. Is there a way to figure that out?

        • gavmor 10 months ago

          (Not OP) I can think of a convoluted and expensive pair-wise comparison method, but I hope there's also a way to figure this out during the application of principal component analysis in a way I don't understand.

          Edit: I'm thinking it can't be done without experimentation on the embedding model.

          Edit2: Ah, even that might not yield results, because as the basis is derived interstitially through computation, there's no guarantee the features of the final coordinate system will have any accessible relationship to those of the initial basis.

tmshapland 10 months ago

Really neat! We were Tule, in the industrials part of the map in grey.

There's something wonky when I zoom in on Chrome on my laptop. It abruptly shifts to another part of the map.

im_dario 10 months ago

Amazing project! The only thing I'm missing is being able to see the filtered companies as a list. But then it wouldn't be a map, I know.

kure256 10 months ago

Love that, what are Axes Y and C?

  • DrawTR 10 months ago

    Apparently inspired by a comment on this very post! (Above yours, right now.)

    > Cool project, but missed opportunity to name the arbitrary dimensions Y and C...

jb1991 10 months ago

Filters are unreadable on mobile.

uncomplexity_ 10 months ago

hella nice mate very interesting

what's the x and y axes?

  • jerrygenser 10 months ago

    they don't have meaning by themselves. they are two dimensions that umap projected the original embeddings down to in order to show a combination of local neighborhood similarity or closeness

    • gavmor 10 months ago

      Well, they do have meaning by themselves, but it's more work to figure that out. All regular, predictable relationships "have" meaning because all meaning is prescribed. And since we've captured many such prescriptions in LLMs, they can do a decent job approximating those.

woodylondon 10 months ago

Really nice to see. Also, it would be great if there was a tabular view at the bottom when filtering.

welder 10 months ago

Company status isn't up to date... I know there's more than 1 public company that went through YC.

  • yoouareperfect 10 months ago

    Check the filters, not all batches are selected as default. Only the latest ones. If you select all of them, then there are many public companies

natural219 10 months ago

Wow. This is amazing. Extremely practical to use, I'm glad I checked H.N. yesterday.

k-i-r-t-h-i 10 months ago

This is awesome! Are you able to also add F24?

gniting 10 months ago

Nice! What's the tech stack?

  • yoouareperfect 10 months ago

    For scraping and all the processing, typescript. Embeddings: openai

    For visualizing: react (nextjs) + plotly (though the lack of mobile zoom makes me question if I should change it)

mring33621 10 months ago

i'd like a filter by target market (US, EU, APAC...)

ksec 10 months ago

I didn't know YC does Government, Healthcare, Industrials, Real Estate and Construction. All these are great sectors and never make the headlines.