2.2 Cross-Modal Retrieval

Implement the following retrieval modes:

- **Text-to-text**: Standard text search (dense + BM25 hybrid).
- **Text-to-image**: Find relevant images given a text query (using CLIP embeddings).
- **Image-to-text**: Find relevant text given an image query (using CLIP embeddings).
- **Image-to-image**: F