LAE, commit caf42399
authored 8 months ago by 文靖昊
Document deduplication
parent e85ee1bd
Showing 2 changed files with 10 additions and 2 deletions
src/controller/web.py (+2 -0)
src/server/get_similarity.py (+8 -2)
src/controller/web.py @ caf42399
@@ -176,6 +176,7 @@ def question(chat_request: ChatRequest, token: str = Header(None)):
         j = {}
         j["page_content"] = d.page_content
         j["from_file"] = d.metadata["filename"]
+        j["page_number"] = 0
         docs_json.append(j)
     # answer = "test Answer"
     if session_id == "":
@@ -220,6 +221,7 @@ def re_generate(chat_request: ReGenerateRequest, token: str = Header(None)):
         j = {}
         j["page_content"] = d.page_content
         j["from_file"] = d.metadata["filename"]
+        j["page_number"] = 0
         docs_json.append(j)
     # answer = "reGenerate Answer"
src/server/get_similarity.py @ caf42399
@@ -62,10 +62,16 @@ class GetSimilarityWithExt:
     def get_text_similarity_with_ext(self):
         similarity_docs = []
         for q in self.question:
             print(q)
             similarity_doc = self.faiss_db.get_text_similarity(q)
             similarity_docs.extend(similarity_doc)
-        return similarity_docs
+        content_set = set()
+        unique_documents = []
+        for doc in similarity_docs:
+            content = hash(doc.page_content)
+            if content not in content_set:
+                unique_documents.append(doc)
+                content_set.add(content)
+        return unique_documents
 DEFAULT_PROMPT = """作为一个向量检索助手,你的任务是结合历史记录,从不同角度,为“原问题”生成个不同版本的“检索词”,从而提高向量检索的语义丰富度,提高向量检索的精度。生成的问题要求指向对象清晰明确,并与“原问题语言相同”。例如:
 ...
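The deduplication step in the get_similarity.py hunk above can be tried in isolation. The following is a minimal sketch, not the project's code: `Doc` is a hypothetical stand-in for whatever document objects `faiss_db.get_text_similarity` returns (only a `page_content` attribute and a `metadata` dict are assumed), and `dedupe_by_content` is a name invented for illustration. Unlike the commit, it keys the set on the `page_content` string itself rather than on `hash(...)`, so two different texts can never be mistaken for duplicates by a hash collision, however unlikely that is.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document object."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def dedupe_by_content(docs):
    """Drop documents whose page_content was already seen, keeping first occurrences in order."""
    seen = set()
    unique = []
    for doc in docs:
        key = doc.page_content  # the text itself serves as the set key
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique


# Several query variants often return the same chunk more than once.
docs = [
    Doc("alpha", {"filename": "a.txt"}),
    Doc("beta", {"filename": "b.txt"}),
    Doc("alpha", {"filename": "c.txt"}),
]
unique = dedupe_by_content(docs)
print([d.metadata["filename"] for d in unique])  # ['a.txt', 'b.txt']
```

Because the first occurrence wins, the metadata of later duplicates (here `c.txt`) is discarded. If the source file of every hit matters, the duplicates would need to be merged instead of dropped.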