chikim@mastodon.social

@chikim@mastodon.social
Posts: 6 | Topics: 2 | Shares: 0 | Groups: 0 | Followers: 0 | Following: 0
Posts

  • My earlier post about converting documents to Markdown with Microsoft MarkItDown got a lot of boosts on Mastodon.
    chikim@mastodon.social

    @twynn I'm not sure about alt text. I meant headings. I doubt it would keep the image alt descriptions. I don't even know if Markdown has an alt-text feature for images.

    Uncategorized
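
On the alt-text question: Markdown does have an image syntax with a description slot, `![alt text](url)`. Whether a given converter fills that slot is a separate matter and is not tested here. The snippet below is only a rough sketch for checking a converter's output for alt text; the regex covers simple inline images, not reference-style ones, and `image_alts` is a made-up helper name.

```python
import re

# Simple inline image syntax: ![alt text](url "optional title")
# Assumption: no nested brackets in the alt text and no spaces in the URL.
IMAGE = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)[^)]*\)")

def image_alts(markdown_text):
    """Return (alt_text, url) pairs for each inline image found."""
    return [(m.group(1), m.group(2)) for m in IMAGE.finditer(markdown_text)]

if __name__ == "__main__":
    sample = 'Intro\n\n![Bar chart of sales](chart.png)\n\n![](logo.png "Logo")'
    for alt, url in image_alts(sample):
        # An empty alt string means the image has no description.
        print(repr(alt), url)
```

An empty first element in a pair means the image came through without a description.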

  • My earlier post about converting documents to Markdown with Microsoft MarkItDown got a lot of boosts on Mastodon.
    chikim@mastodon.social

    @twynn Yes, for example it preserves headings, whereas MarkItDown dropped the headings from both PDF and docx. Also, I haven't tried it, but Docling has an automatic image caption feature with vllm.

    Uncategorized

  • My earlier post about converting documents to Markdown with Microsoft MarkItDown got a lot of boosts on Mastodon.
    chikim@mastodon.social

    My earlier post about converting documents to Markdown with Microsoft MarkItDown got a lot of boosts on Mastodon. On Reddit, people said Docling is better, so I tried it and I agree. Docling does a much better job preserving structure and tags, and it is definitely worth checking out! https://docling-project.github.io/docling/

    Uncategorized
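
A rough way to check the "preserving structure" point for yourself is to count the headings in each tool's Markdown output. The sketch below assumes you already have both outputs as strings; the two sample strings are invented for illustration and are not real output from either tool. It only looks at `#`-style headings and does not skip fenced code blocks.

```python
import re

# ATX headings: up to three leading spaces, one to six '#', whitespace, then a title.
# Limitation: a '# comment' line inside a fenced code block would also match.
HEADING = re.compile(r"^ {0,3}(#{1,6})[ \t]+(.*\S)[ \t]*$", re.MULTILINE)

def headings(markdown_text):
    """Return (level, title) pairs for every ATX heading in the text."""
    return [(len(m.group(1)), m.group(2)) for m in HEADING.finditer(markdown_text)]

if __name__ == "__main__":
    # Hypothetical outputs from two converters for the same document.
    structured = "# Report\n\nIntro text.\n\n## Methods\n\nDetails."
    flattened = "Report\n\nIntro text.\n\nMethods\n\nDetails."
    print(len(headings(structured)), "headings kept")
    print(len(headings(flattened)), "headings kept")
```

If one output reports zero headings for a document that clearly has sections, the structure was flattened somewhere in the conversion.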

  • Just discovered Microsoft has a tool to convert documents (pdf, docx, pptx, xlsx, html, Outlook messages...) to Markdown as well as transcribe audio and YouTube links!
    chikim@mastodon.social

    @x0 @marvellousmachine @ondrosik A lot of people also mentioned that Docling is better! It might be worth checking out.

    Uncategorized

  • Just discovered Microsoft has a tool to convert documents (pdf, docx, pptx, xlsx, html, Outlook messages...) to Markdown as well as transcribe audio and YouTube links!
    chikim@mastodon.social

    @x0 @marvellousmachine @ondrosik Not sure if Pandoc has support for OCR, Outlook messages, speech transcription, LLM support for MCP server, etc. Total speculation, but I suspect they created it specifically to digest all kinds of documents for LLM training.

    Uncategorized

  • Just discovered Microsoft has a tool to convert documents (pdf, docx, pptx, xlsx, html, Outlook messages...) to Markdown as well as transcribe audio and YouTube links!
    chikim@mastodon.social

    Just discovered Microsoft has a tool to convert documents (pdf, docx, pptx, xlsx, html, Outlook messages...) to Markdown as well as transcribe audio and YouTube links! https://github.com/microsoft/markitdown

    Uncategorized
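
For anyone who wants to try the tool from Python rather than the command line, here is a minimal sketch. It assumes the `markitdown` package is installed and uses its `MarkItDown().convert(...)` call and the `text_content` attribute of the result; `markdown_path_for` and `convert_file` are made-up helper names, and `report.docx` is a placeholder file.

```python
from pathlib import Path

def markdown_path_for(source):
    """Return the .md path that sits next to the source document."""
    return Path(source).with_suffix(".md")

def convert_file(source):
    """Convert one document to Markdown and write it next to the source."""
    # Imported here so the path helper above works without the package installed.
    from markitdown import MarkItDown  # assumption: pip install markitdown

    result = MarkItDown().convert(str(source))
    target = markdown_path_for(source)
    target.write_text(result.text_content, encoding="utf-8")
    return target

if __name__ == "__main__":
    # Placeholder input; point this at a real pdf, docx, pptx, or xlsx file.
    print(convert_file("report.docx"))
```

Some formats may need optional extras at install time, so check the project README before relying on this for every file type.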